Abstract 

The alternating direction method of multipliers (ADMM) is widely used in solving structured 
convex optimization problems due to its superior practical performance. On the theoretical side, 
however, a counterexample was shown in [7] indicating that the multi-block ADMM for minimizing 
the sum of $N$ ($N \ge 3$) convex functions with $N$ block variables linked by linear constraints may 
diverge. It is therefore of great interest to investigate further sufficient conditions on the input side 
which can guarantee convergence for the multi-block ADMM. The existing results typically require 
strong convexity on parts of the objective. In this paper, we present convergence and convergence 
rate results for the multi-block ADMM applied to solve certain $N$-block ($N \ge 3$) convex minimization 
problems without requiring strong convexity. Specifically, we prove the following two results: (1) the 
multi-block ADMM returns an $\epsilon$-optimal solution within $O(1/\epsilon^2)$ iterations by solving an associated 
perturbation to the original problem; (2) the multi-block ADMM returns an $\epsilon$-optimal solution within 
$O(1/\epsilon)$ iterations when it is applied to solve a certain sharing problem, under the condition that the 
augmented Lagrangian function satisfies the Kurdyka-Lojasiewicz property, which essentially covers 
most convex optimization models except for some pathological cases. 


Keywords: Alternating Direction Method of Multipliers (ADMM), Convergence Rate, Regularization, 
Kurdyka-Lojasiewicz Property, Convex Optimization 


1 Introduction 

We consider the following multi-block convex minimization problem: 

$$
\begin{array}{ll}
\min & f_1(x_1) + f_2(x_2) + \cdots + f_N(x_N) \\
\text{s.t.} & A_1 x_1 + A_2 x_2 + \cdots + A_N x_N = b \\
& x_i \in \mathcal{X}_i, \quad i = 1, \ldots, N,
\end{array} \tag{1.1}
$$

where $A_i \in \mathbb{R}^{p \times n_i}$, $b \in \mathbb{R}^p$, $\mathcal{X}_i \subset \mathbb{R}^{n_i}$ are closed convex sets, and $f_i : \mathbb{R}^{n_i} \to \mathbb{R}$ are closed convex 
functions. One effective way to solve (1.1), whenever applicable, is the so-called Alternating Direction 
Method of Multipliers (ADMM). The ADMM is closely related to the Douglas-Rachford [H] and 
Peaceman-Rachford [32] operator splitting methods that date back to the 1950s. These operator splitting 
methods were further studied later in [30l HS] HTJ [l2] . The ADMM has been revisited recently due to its 
success in solving problems with special structures arising from compressed sensing, machine learning, 
image processing, and so on; see the recent survey papers um for more information. 

The ADMM is constructed under an augmented Lagrangian framework, where the augmented 
Lagrangian function for (1.1) is defined as 

$$
\mathcal{L}_\gamma(x_1, \ldots, x_N; \lambda) := \sum_{j=1}^N f_j(x_j) - \Big\langle \lambda, \sum_{j=1}^N A_j x_j - b \Big\rangle + \frac{\gamma}{2} \Big\| \sum_{j=1}^N A_j x_j - b \Big\|^2,
$$

where $\lambda$ is the Lagrange multiplier and $\gamma > 0$ is a penalty parameter. In a typical iteration of the 
ADMM for solving (1.1), the following updating procedure is implemented: 



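$$
\begin{array}{rcl}
x_1^{k+1} & := & \arg\min_{x_1 \in \mathcal{X}_1} \mathcal{L}_\gamma(x_1, x_2^k, \ldots, x_N^k; \lambda^k) \\
x_2^{k+1} & := & \arg\min_{x_2 \in \mathcal{X}_2} \mathcal{L}_\gamma(x_1^{k+1}, x_2, x_3^k, \ldots, x_N^k; \lambda^k) \\
& \vdots & \\
x_N^{k+1} & := & \arg\min_{x_N \in \mathcal{X}_N} \mathcal{L}_\gamma(x_1^{k+1}, x_2^{k+1}, \ldots, x_{N-1}^{k+1}, x_N; \lambda^k) \\
\lambda^{k+1} & := & \lambda^k - \gamma \Big( \sum_{j=1}^N A_j x_j^{k+1} - b \Big).
\end{array} \tag{1.2}
$$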
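
To make the Gauss-Seidel structure of (1.2) concrete, the following is a minimal sketch on a toy instance that is not taken from the paper: scalar blocks with $A_i = 1$, $\mathcal{X}_i = \mathbb{R}$ and $f_i(x_i) = \frac{1}{2}(x_i - c_i)^2$. These $f_i$ are strongly convex, so the instance lies outside the setting this paper targets; it is chosen only because every subproblem has a closed form. The function name `admm_quadratic` and the default `gamma=0.1` are illustrative assumptions.

```python
def admm_quadratic(c, b, gamma=0.1, iters=500):
    """Gauss-Seidel multi-block ADMM, in the form of (1.2), for the toy problem

        min  sum_i 0.5 * (x_i - c_i)^2   s.t.  sum_i x_i = b,

    with scalar blocks and A_i = 1 (an illustrative assumption).
    """
    n = len(c)
    x = [0.0] * n
    lam = 0.0
    for _ in range(iters):
        for i in range(n):
            # r collects the other blocks: blocks j < i already hold their
            # (k+1)-th values, blocks j > i still hold their k-th values.
            r = sum(x) - x[i]
            # Stationarity of the augmented Lagrangian in x_i:
            #   (x_i - c_i) - lam + gamma * (x_i + r - b) = 0
            x[i] = (c[i] + lam - gamma * (r - b)) / (1.0 + gamma)
        # Multiplier update: the last line of (1.2).
        lam -= gamma * (sum(x) - b)
    return x, lam
```

For this toy problem the minimizer is $x_i^* = c_i + (b - \sum_j c_j)/N$ with multiplier $\lambda^* = (b - \sum_j c_j)/N$, which gives a simple reference point for the iterates.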


Note that the ADMM (1.2) minimizes in each iteration the augmented Lagrangian function with 
respect to $x_1, \ldots, x_N$ alternatingly in a Gauss-Seidel manner. The ADMM (1.2) for solving two-block 
convex minimization problems (i.e., $N = 2$) has been studied extensively in the literature. The global 
convergence of ADMM (1.2) when $N = 2$ has been shown in [161 (H]. There are also some recent works 
that study the convergence rate properties of ADMM when $N = 2$ (see, e.g., [23l[3lllini[2l[22]). 

However, the convergence of multi-block ADMM (1.2) (we call (1.2) multi-block ADMM when $N \ge 3$) 
has remained unclear for a long time. Recently, Chen et al. [7] constructed a counterexample to show 
the failure of ADMM (1.2) when $N \ge 3$. Notwithstanding its lack of theoretical convergence assurance, the 
multi-block ADMM (1.2) has been applied very successfully to solve problems with $N$ ($N \ge 3$) block 
variables; for example, see [35, 33]. It is thus of great interest to further study sufficient conditions that 
can guarantee the convergence of multi-block ADMM. Some recent works on sufficient 
conditions guaranteeing the convergence of multi-block ADMM are described briefly as follows. Han 
and Yuan [18] showed that the multi-block ADMM (1.2) converges if all the functions $f_1, \ldots, f_N$ are strongly 
convex and $\gamma$ is restricted to a certain region. This condition is relaxed in [8, 28] to allow only $N - 1$ 
functions to be strongly convex, with $\gamma$ restricted to a certain region. In particular, Lin, Ma and Zhang 
[28] proved the sublinear convergence rate under such conditions. Closely related to [8, 28], Cai, Han 
and Yuan [6] and Li, Sun and Toh m proved that for $N = 3$, convergence of multi-block ADMM 
can be guaranteed under the assumption that only one function among $f_1$, $f_2$ and $f_3$ is 
strongly convex, and $\gamma$ is restricted to a certain region. In addition to strong convexity of $f_2, \ldots, f_N$, 
by assuming further conditions on the smoothness of the functions and some rank conditions on the 
matrices in the linear constraints, Lin, Ma and Zhang [29] proved the globally linear convergence of 
multi-block ADMM. Note that the above mentioned works all require that (parts of) the objective 
function is strongly convex. Without assuming strong convexity, Hong and Luo [25] studied a variant 











of ADMM (1.2) with a small stepsize in updating the Lagrange multiplier. Specifically, [25] proposes 
to replace the last equation in (1.2) by 

$$
\lambda^{k+1} := \lambda^k - \alpha \gamma \Big( \sum_{j=1}^N A_j x_j^{k+1} - b \Big),
$$

where $\alpha > 0$ is a small step size. Linear convergence of this variant is proven under the assumption that 
the objective function satisfies certain error bound conditions. However, it is noted that the selection 
of $\alpha$ is in fact bounded by some parameters associated with the error bound conditions to guarantee 
the convergence. Therefore, it might be difficult to choose $\alpha$ in practice. There are also studies on the 
convergence and convergence rate of some other variants of ADMM (1.2), and we refer the interested 
readers to [2Q1I2I1 (El HIMl Ea ESI for the details of these variants. However, it is observed by many 
researchers that modified versions of ADMM, though with convergence guarantees, often perform slower 
than the multi-block ADMM with no convergence guarantee (see [31]). Therefore, in this paper, we focus 
on studying the sufficient conditions that guarantee the convergence of the direct extension of ADMM, 
i.e., the multi-block ADMM (1.2), and on studying its convergence rate. 

Our contribution. The main contribution of this paper lies in the following. First, we show that the 
ADMM (1.2) when $N \ge 3$ returns an $\epsilon$-optimal solution within $O(1/\epsilon^2)$ iterations, with the condition 
that $\gamma$ depends on $\epsilon$. Here we do not assume strong convexity of any objective function $f_i$. It should 
be pointed out that our result does not contradict the counterexample proposed in [7] since we apply 
the ADMM (1.2) to an associated perturbed problem of (1.1) rather than (1.1) itself. Secondly, we 
show that the ADMM (1.2) when $N \ge 3$ returns an $\epsilon$-optimal solution within $O(1/\epsilon)$ iterations under 
the condition that the augmented Lagrangian $\mathcal{L}_\gamma$ is a Kurdyka-Lojasiewicz (KL) function [3, 1], $\nabla f_N$ is 
Lipschitz continuous, $A_N = I$, and $\gamma$ is sufficiently large. To the best of our knowledge, the convergence 
rate results given in this paper are the first sublinear convergence rate results for the unmodified 
multi-block ADMM without assuming any strong convexity of the objective function (note that although 
[25] does not assume strong convexity, it studies a variant of the multi-block ADMM). In this sense, the 
results presented in this paper complement the existing results in the literature. 

Organization. The rest of this paper is organized as follows. In Section 2 we provide some preliminaries 
for our convergence rate analysis. In Section 3 we prove the $O(1/\epsilon^2)$ iteration complexity of ADMM (1.2) 
by introducing an associated problem of (1.1). In Section 4 we prove the $O(1/\epsilon)$ iteration complexity 
of ADMM (1.2) with the Kurdyka-Lojasiewicz (KL) property. 

2 Preliminaries 

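We denote $\Omega = \mathcal{X}_1 \times \cdots \times \mathcal{X}_N \times \mathbb{R}^p$ and the optimal set of (1.1) as $\Omega^*$, and the following assumption 
is made throughout this paper. 

Assumption 2.1 The optimal set $\Omega^*$ for problem (1.1) is non-empty. 

According to the first-order optimality conditions for (1.1), solving (1.1) is equivalent to finding 
$(x_1^*, \ldots, x_N^*, \lambda^*) \in \Omega^*$ such that the following holds: 

$$
\left\{
\begin{array}{l}
(x_i - x_i^*)^\top \big( g_i(x_i^*) - A_i^\top \lambda^* \big) \ge 0, \quad \forall x_i \in \mathcal{X}_i, \\
A_1 x_1^* + \cdots + A_N x_N^* - b = 0,
\end{array}
\right. \tag{2.1}
$$

for $i = 1, 2, \ldots, N$, where $g_i(x_i^*) \in \partial f_i(x_i^*)$ is a subgradient of $f_i$ at $x_i^*$. 

In this paper, we analyze the iteration complexity of ADMM (1.2) under two scenarios. The conditions 
of the two scenarios are listed in Table 1. The following assumption is only used in Scenario 2. 

Assumption 2.2 We assume that $\mathcal{X}_N = \mathbb{R}^{n_N}$. We also assume that $f_i$ has a finite lower bound, i.e., 
$f_i(x_i) \ge f_i^* > -\infty$ for all $x_i \in \mathcal{X}_i$ and $i = 1, 2, \ldots, N$. Moreover, it is assumed that $f_i + \mathbf{1}_{\mathcal{X}_i}$ is a coercive 
function for $i = 1, 2, \ldots, N - 1$, where $\mathbf{1}_{\mathcal{X}_i}$ denotes the indicator function of $\mathcal{X}_i$, i.e., 

$$
\mathbf{1}_{\mathcal{X}_i}(x_i) = \begin{cases} 0, & \text{if } x_i \in \mathcal{X}_i, \\ +\infty, & \text{otherwise.} \end{cases}
$$

Furthermore, we assume that $\mathcal{L}_\gamma$ is a KL function (to be defined later). 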


Scenario | Lipschitz Continuous | Matrices  | Additional Assumption                   | Iteration Complexity
1        | -                    | -         | $\epsilon/2 \le \gamma \le \epsilon$    | $O(1/\epsilon^2)$
2        | $\nabla f_N$         | $A_N = I$ | $\gamma > \sqrt{2} L$ and Assumption 2.2 | $O(1/\epsilon)$

Table 1: Two Scenarios Leading to Sublinear Convergence 


Remark 2.3 Some remarks are in order here regarding the conditions in Scenario 2. Note that it 
is not very restrictive to require $f_i + \mathbf{1}_{\mathcal{X}_i}$ to be a coercive function. In fact, many functions used as 
regularization terms, including the $\ell_1$-norm, $\ell_2$-norm and $\ell_\infty$-norm for vectors and the nuclear norm for matrices, 
are all coercive functions; assuming the compactness of $\mathcal{X}_i$ also leads to the coerciveness of $f_i + \mathbf{1}_{\mathcal{X}_i}$. 
Moreover, the assumptions that $A_N = I$ and $\nabla f_N$ is Lipschitz continuous actually cover many interesting 
applications in practice. For example, many problems arising from machine learning, statistics, image 
processing and so on have the following structure: 

$$
\min \; f_1(x_1) + \cdots + f_{N-1}(x_{N-1}) + f_N(b - A_1 x_1 - \cdots - A_{N-1} x_{N-1}), \tag{2.2}
$$

where $f_N$ denotes a loss function on data fitting, which is usually a smooth function, and $f_1, \ldots, f_{N-1}$ 
are regularization terms to promote certain structures of the solution. This problem is usually referred 
to as the sharing problem. It can be reformulated as 

$$
\begin{array}{ll}
\min & f_1(x_1) + \cdots + f_{N-1}(x_{N-1}) + f_N(x_N) \\
\text{s.t.} & A_1 x_1 + \cdots + A_{N-1} x_{N-1} + x_N = b,
\end{array} \tag{2.3}
$$

which is in the form of (1.1) and can be solved by ADMM (see [26]). Note that $A_N = I$ in (2.3) and 
it is very natural to assume that $\nabla f_N$ is Lipschitz continuous. Thus the conditions in Scenario 2 are 
satisfied. 
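
As a small illustration of the reformulation from (2.2) to (2.3), the sketch below evaluates both forms on scalar blocks and confirms that introducing $x_N = b - \sum_i A_i x_i$ leaves the objective unchanged while satisfying the linear constraint with $A_N = I$. The helper names and the scalar setting are assumptions made only for this example.

```python
def sharing_objective(fs, f_loss, A, xs, b):
    """Objective (2.2): sum_i f_i(x_i) + f_N(b - sum_i A_i x_i), scalar blocks."""
    residual = b - sum(a * x for a, x in zip(A, xs))
    return sum(f(x) for f, x in zip(fs, xs)) + f_loss(residual)


def constrained_form(fs, f_loss, A, xs, b):
    """Reformulation (2.3): introduce x_N = b - sum_i A_i x_i so that
    sum_i A_i x_i + x_N = b holds with A_N = I."""
    x_N = b - sum(a * x for a, x in zip(A, xs))
    value = sum(f(x) for f, x in zip(fs, xs)) + f_loss(x_N)
    constraint_residual = sum(a * x for a, x in zip(A, xs)) + x_N - b
    return value, x_N, constraint_residual
```

For instance, with two absolute-value regularizers and a least-squares loss, `sharing_objective([abs, abs], lambda r: 0.5 * r * r, [2.0, -1.0], [1.0, 3.0], 4.0)` and the first component returned by `constrained_form` on the same arguments agree.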













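Notations. For simplicity, we use the following notation to denote the stacked vectors or tuples: 

$$
u = \begin{pmatrix} x_1 \\ \vdots \\ x_N \end{pmatrix}, \quad
u^k = \begin{pmatrix} x_1^k \\ \vdots \\ x_N^k \end{pmatrix}, \quad
u^* = \begin{pmatrix} x_1^* \\ \vdots \\ x_N^* \end{pmatrix}, \quad
w = \begin{pmatrix} u \\ \lambda \end{pmatrix}, \quad
w^k = \begin{pmatrix} u^k \\ \lambda^k \end{pmatrix}, \quad
w^* = \begin{pmatrix} u^* \\ \lambda^* \end{pmatrix}.
$$

We denote by $f(u) = f_1(x_1) + \cdots + f_N(x_N)$ the objective function of problem (1.1); $\mathbf{1}_{\mathcal{X}}$ is the indicator 
function of $\mathcal{X}$; $\nabla f$ is the gradient of $f$; $\|x\|$ denotes the Euclidean norm of $x$. 

In our analysis, the following two well-known identities are used frequently: 

$$
(w_1 - w_2)^\top (w_3 - w_4) = \frac{1}{2} \left( \|w_1 - w_4\|^2 - \|w_1 - w_3\|^2 \right) + \frac{1}{2} \left( \|w_3 - w_2\|^2 - \|w_4 - w_2\|^2 \right), \tag{2.4}
$$

$$
(w_1 - w_2)^\top (w_3 - w_1) = \frac{1}{2} \left( \|w_2 - w_3\|^2 - \|w_1 - w_2\|^2 - \|w_1 - w_3\|^2 \right). \tag{2.5}
$$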
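
Since the two identities are used repeatedly in the proofs, a quick numerical sanity check may be helpful. The sketch below evaluates the gap between the two sides of (2.4) and (2.5) for arbitrary vectors; the function names are illustrative assumptions.

```python
import random


def dot(a, b):
    return sum(p * q for p, q in zip(a, b))


def sub(a, b):
    return [p - q for p, q in zip(a, b)]


def sq_norm(a):
    return dot(a, a)


def identity_24_gap(w1, w2, w3, w4):
    """|LHS - RHS| of identity (2.4)."""
    lhs = dot(sub(w1, w2), sub(w3, w4))
    rhs = 0.5 * (sq_norm(sub(w1, w4)) - sq_norm(sub(w1, w3))) \
        + 0.5 * (sq_norm(sub(w3, w2)) - sq_norm(sub(w4, w2)))
    return abs(lhs - rhs)


def identity_25_gap(w1, w2, w3):
    """|LHS - RHS| of identity (2.5)."""
    lhs = dot(sub(w1, w2), sub(w3, w1))
    rhs = 0.5 * (sq_norm(sub(w2, w3)) - sq_norm(sub(w1, w2))
                 - sq_norm(sub(w1, w3)))
    return abs(lhs - rhs)


def random_vector(rng, dim):
    return [rng.uniform(-1.0, 1.0) for _ in range(dim)]
```

Both gaps should vanish up to floating-point rounding for any choice of vectors, since the identities follow by expanding the squared norms.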


3 Iteration Complexity of ADMM: Associated Perturbation 


In this section, we prove the $O(1/\epsilon^2)$ iteration complexity of ADMM (1.2) under the conditions in 
Scenario 1 of Table 1. Indeed, given $\epsilon > 0$ sufficiently small and initial point $u^0$, we introduce an 
associated perturbed problem of (1.1), i.e., 

$$
\begin{array}{ll}
\min & f_1(x_1) + \tilde{f}_2(x_2) + \cdots + \tilde{f}_N(x_N) \\
\text{s.t.} & A_1 x_1 + A_2 x_2 + \cdots + A_N x_N = b \\
& x_i \in \mathcal{X}_i, \quad i = 1, \ldots, N,
\end{array} \tag{3.1}
$$

where $\tilde{f}_i(x_i) = f_i(x_i) + \frac{\mu}{2} \|A_i x_i - A_i x_i^0\|^2$ for $i = 2, \ldots, N$, and $\mu = \epsilon (N - 2)(N + 1)$. Note that the $\tilde{f}_i$ are 
not necessarily strongly convex. We prove that the ADMM (1.2) for the associated perturbed problem 
(3.1) returns an $\epsilon$-optimal solution of the original problem (1.1), in terms of both objective value and 
constraint violation, within $O(1/\epsilon^2)$ iterations. 
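
The parameter choices of Scenario 1 can be collected in a few lines. The sketch below computes $\mu = \epsilon (N-2)(N+1)$ and the admissible range for $\gamma$, reading the Scenario 1 condition of Table 1 as $\epsilon/2 \le \gamma \le \epsilon$, and evaluates a perturbed function $\tilde{f}_i$ for scalar blocks. The helper names and the scalar setting are assumptions for illustration only.

```python
def scenario1_parameters(eps, N):
    """Return (mu, (gamma_low, gamma_high)) for Scenario 1.

    mu = eps * (N - 2) * (N + 1) is the perturbation weight in (3.1);
    the gamma range assumes the condition eps/2 <= gamma <= eps.
    """
    if N < 3:
        raise ValueError("the multi-block setting needs N >= 3")
    if eps <= 0:
        raise ValueError("eps must be positive")
    mu = eps * (N - 2) * (N + 1)
    return mu, (eps / 2.0, eps)


def perturbed_value(f_i, A_i, x_i, x_i0, mu):
    """Scalar version of f_i(x_i) + (mu / 2) * (A_i x_i - A_i x_i^0)^2."""
    diff = A_i * x_i - A_i * x_i0
    return f_i(x_i) + 0.5 * mu * diff * diff
```

Note that the perturbation only penalizes $A_i x_i$, so $\tilde{f}_i$ is strongly convex in $x_i$ only when $A_i$ has full column rank, which is consistent with the remark that the $\tilde{f}_i$ need not be strongly convex.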

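The ADMM for solving (3.1) can be summarized as (note that some constant terms in the subproblems 
are discarded): 

$$
x_1^{k+1} := \arg\min_{x_1 \in \mathcal{X}_1} f_1(x_1) + \frac{\gamma}{2} \Big\| A_1 x_1 + \sum_{j=2}^N A_j x_j^k - b - \frac{1}{\gamma} \lambda^k \Big\|^2, \tag{3.2}
$$

$$
x_i^{k+1} := \arg\min_{x_i \in \mathcal{X}_i} \tilde{f}_i(x_i) + \frac{\gamma}{2} \Big\| \sum_{j=1}^{i-1} A_j x_j^{k+1} + A_i x_i + \sum_{j=i+1}^N A_j x_j^k - b - \frac{1}{\gamma} \lambda^k \Big\|^2, \quad i = 2, \ldots, N, \tag{3.3}
$$

$$
\lambda^{k+1} := \lambda^k - \gamma \Big( \sum_{j=1}^N A_j x_j^{k+1} - b \Big). \tag{3.4}
$$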
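The first-order optimality conditions for (3.2)-(3.3) are given respectively by $x_i^{k+1} \in \mathcal{X}_i$ and 

$$
(x_1 - x_1^{k+1})^\top \Big[ g_1(x_1^{k+1}) - A_1^\top \lambda^k + \gamma A_1^\top \Big( A_1 x_1^{k+1} + \sum_{j=2}^N A_j x_j^k - b \Big) \Big] \ge 0, \tag{3.5}
$$

$$
(x_i - x_i^{k+1})^\top \Big[ g_i(x_i^{k+1}) + \mu A_i^\top A_i (x_i^{k+1} - x_i^0) - A_i^\top \lambda^k + \gamma A_i^\top \Big( \sum_{j=1}^{i} A_j x_j^{k+1} + \sum_{j=i+1}^N A_j x_j^k - b \Big) \Big] \ge 0, \tag{3.6}
$$

which hold for any $x_i \in \mathcal{X}_i$ and $g_i \in \partial f_i$, a subgradient of $f_i$, for $i = 1, 2, \ldots, N$. Moreover, by combining with 
(3.4), (3.5)-(3.6) can be rewritten as 

$$
(x_1 - x_1^{k+1})^\top \Big[ g_1(x_1^{k+1}) - A_1^\top \lambda^{k+1} + \gamma A_1^\top \sum_{j=2}^N A_j (x_j^k - x_j^{k+1}) \Big] \ge 0, \tag{3.7}
$$

$$
(x_i - x_i^{k+1})^\top \Big[ g_i(x_i^{k+1}) + \mu A_i^\top A_i (x_i^{k+1} - x_i^0) - A_i^\top \lambda^{k+1} + \gamma A_i^\top \sum_{j=i+1}^N A_j (x_j^k - x_j^{k+1}) \Big] \ge 0. \tag{3.8}
$$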

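Lemma 3.1 Let $(x_1^{k+1}, \ldots, x_N^{k+1}, \lambda^{k+1}) \in \Omega$ be generated by the ADMM (1.2) from given $(x_2^k, \ldots, x_N^k, \lambda^k)$. 
For any $u^* = (x_1^*, x_2^*, \ldots, x_N^*) \in \Omega^*$ and $\lambda \in \mathbb{R}^p$, it holds true under the conditions in Scenario 1 that 

$$
\begin{aligned}
& f(u^*) - f(u^{k+1}) +
\begin{pmatrix} x_1^* - x_1^{k+1} \\ x_2^* - x_2^{k+1} \\ \vdots \\ x_N^* - x_N^{k+1} \\ \lambda - \lambda^{k+1} \end{pmatrix}^\top
\begin{pmatrix} -A_1^\top \lambda^{k+1} \\ -A_2^\top \lambda^{k+1} \\ \vdots \\ -A_N^\top \lambda^{k+1} \\ \sum_{i=1}^N A_i x_i^{k+1} - b \end{pmatrix}
+ \frac{1}{2\gamma} \left( \|\lambda - \lambda^k\|^2 - \|\lambda - \lambda^{k+1}\|^2 \right) \\
& + \frac{\epsilon (N-2)(N+1)}{2} \sum_{i=2}^N \|A_i x_i^* - A_i x_i^0\|^2 \\
& + \frac{\gamma}{2} \sum_{i=1}^{N-1} \left( \Big\| \sum_{j=1}^{i} A_j x_j^* + \sum_{j=i+1}^N A_j x_j^k - b \Big\|^2 - \Big\| \sum_{j=1}^{i} A_j x_j^* + \sum_{j=i+1}^N A_j x_j^{k+1} - b \Big\|^2 \right) \ge 0.
\end{aligned} \tag{3.9}
$$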


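Proof. Note that combining (3.7)-(3.8) yields 

$$
\begin{pmatrix} x_1 - x_1^{k+1} \\ x_2 - x_2^{k+1} \\ \vdots \\ x_N - x_N^{k+1} \end{pmatrix}^\top
\left[
\begin{pmatrix} g_1(x_1^{k+1}) - A_1^\top \lambda^{k+1} \\ g_2(x_2^{k+1}) - A_2^\top \lambda^{k+1} \\ \vdots \\ g_N(x_N^{k+1}) - A_N^\top \lambda^{k+1} \end{pmatrix}
+ \begin{pmatrix} 0 \\ \mu A_2^\top (A_2 x_2^{k+1} - A_2 x_2^0) \\ \vdots \\ \mu A_N^\top (A_N x_N^{k+1} - A_N x_N^0) \end{pmatrix}
+ H \begin{pmatrix} x_2^k - x_2^{k+1} \\ \vdots \\ x_N^k - x_N^{k+1} \end{pmatrix}
\right] \ge 0, \tag{3.10}
$$

where $H \in \mathbb{R}^{(\sum_{i=1}^N n_i) \times (\sum_{i=2}^N n_i)}$ is defined as follows: 

$$
H := \begin{pmatrix}
\gamma A_1^\top A_2 & \gamma A_1^\top A_3 & \cdots & \gamma A_1^\top A_N \\
0 & \gamma A_2^\top A_3 & \cdots & \gamma A_2^\top A_N \\
\vdots & & \ddots & \vdots \\
0 & 0 & \cdots & \gamma A_{N-1}^\top A_N \\
0 & 0 & \cdots & 0
\end{pmatrix}.
$$

The key step in our proof is to bound the following terms 

$$
(x_i - x_i^{k+1})^\top A_i^\top \Big( \sum_{j=i+1}^N A_j (x_j^k - x_j^{k+1}) \Big), \quad i = 1, 2, \ldots, N - 1.
$$

For $i = 1, 2, \ldots, N - 1$, we have 

$$
\begin{aligned}
& (x_i - x_i^{k+1})^\top A_i^\top \Big( \sum_{j=i+1}^N A_j (x_j^k - x_j^{k+1}) \Big) \\
={} & \Big[ \Big( \sum_{j=1}^{i} A_j x_j - b \Big) - \Big( \sum_{j=1}^{i-1} A_j x_j + A_i x_i^{k+1} - b \Big) \Big]^\top \Big[ \Big( - \sum_{j=i+1}^N A_j x_j^{k+1} \Big) - \Big( - \sum_{j=i+1}^N A_j x_j^k \Big) \Big] \\
={} & \frac{1}{2} \left( \Big\| \sum_{j=1}^{i} A_j x_j + \sum_{j=i+1}^N A_j x_j^k - b \Big\|^2 - \Big\| \sum_{j=1}^{i} A_j x_j + \sum_{j=i+1}^N A_j x_j^{k+1} - b \Big\|^2 \right) \\
& + \frac{1}{2} \left( \Big\| \sum_{j=1}^{i-1} A_j x_j + \sum_{j=i}^N A_j x_j^{k+1} - b \Big\|^2 - \Big\| \sum_{j=1}^{i-1} A_j x_j + A_i x_i^{k+1} + \sum_{j=i+1}^N A_j x_j^k - b \Big\|^2 \right) \\
\le{} & \frac{1}{2} \left( \Big\| \sum_{j=1}^{i} A_j x_j + \sum_{j=i+1}^N A_j x_j^k - b \Big\|^2 - \Big\| \sum_{j=1}^{i} A_j x_j + \sum_{j=i+1}^N A_j x_j^{k+1} - b \Big\|^2 \right)
+ \frac{1}{2} \Big\| \sum_{j=1}^{i-1} A_j x_j + \sum_{j=i}^N A_j x_j^{k+1} - b \Big\|^2,
\end{aligned}
$$

where in the second equality we applied the identity (2.4). 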
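Therefore, we have 

$$
\begin{aligned}
& \begin{pmatrix} x_1 - x_1^{k+1} \\ x_2 - x_2^{k+1} \\ \vdots \\ x_N - x_N^{k+1} \end{pmatrix}^\top
\begin{pmatrix}
\gamma A_1^\top A_2 & \gamma A_1^\top A_3 & \cdots & \gamma A_1^\top A_N \\
0 & \gamma A_2^\top A_3 & \cdots & \gamma A_2^\top A_N \\
\vdots & & \ddots & \vdots \\
0 & 0 & \cdots & \gamma A_{N-1}^\top A_N \\
0 & 0 & \cdots & 0
\end{pmatrix}
\begin{pmatrix} x_2^k - x_2^{k+1} \\ \vdots \\ x_N^k - x_N^{k+1} \end{pmatrix} \\
\le{} & \frac{\gamma}{2} \sum_{i=1}^{N-1} \left( \Big\| \sum_{j=1}^{i} A_j x_j + \sum_{j=i+1}^N A_j x_j^k - b \Big\|^2 - \Big\| \sum_{j=1}^{i} A_j x_j + \sum_{j=i+1}^N A_j x_j^{k+1} - b \Big\|^2 \right) \\
& + \frac{1}{2\gamma} \|\lambda^{k+1} - \lambda^k\|^2 + \frac{\gamma}{2} \sum_{i=2}^{N-1} \Big\| \sum_{j=1}^{i-1} A_j x_j + \sum_{j=i}^N A_j x_j^{k+1} - b \Big\|^2.
\end{aligned} \tag{3.11}
$$

Combining (3.4), (3.10) and (3.11), it holds for any $\lambda \in \mathbb{R}^p$ that 

$$
\begin{aligned}
& \begin{pmatrix} x_1 - x_1^{k+1} \\ x_2 - x_2^{k+1} \\ \vdots \\ x_N - x_N^{k+1} \\ \lambda - \lambda^{k+1} \end{pmatrix}^\top
\begin{pmatrix} g_1(x_1^{k+1}) - A_1^\top \lambda^{k+1} \\ g_2(x_2^{k+1}) - A_2^\top \lambda^{k+1} \\ \vdots \\ g_N(x_N^{k+1}) - A_N^\top \lambda^{k+1} \\ \sum_{i=1}^N A_i x_i^{k+1} - b \end{pmatrix}
+ \frac{1}{\gamma} (\lambda - \lambda^{k+1})^\top (\lambda^{k+1} - \lambda^k) \\
& + \mu \sum_{i=2}^N (x_i - x_i^{k+1})^\top A_i^\top A_i (x_i^{k+1} - x_i^0)
+ \frac{1}{2\gamma} \|\lambda^{k+1} - \lambda^k\|^2
+ \frac{\gamma}{2} \sum_{i=2}^{N-1} \Big\| \sum_{j=1}^{i-1} A_j x_j + \sum_{j=i}^N A_j x_j^{k+1} - b \Big\|^2 \\
& + \frac{\gamma}{2} \sum_{i=1}^{N-1} \left( \Big\| \sum_{j=1}^{i} A_j x_j + \sum_{j=i+1}^N A_j x_j^k - b \Big\|^2 - \Big\| \sum_{j=1}^{i} A_j x_j + \sum_{j=i+1}^N A_j x_j^{k+1} - b \Big\|^2 \right) \ge 0.
\end{aligned} \tag{3.12}
$$

Using (2.5), we have 

$$
\frac{1}{\gamma} (\lambda - \lambda^{k+1})^\top (\lambda^{k+1} - \lambda^k) + \frac{1}{2\gamma} \|\lambda^{k+1} - \lambda^k\|^2 = \frac{1}{2\gamma} \left( \|\lambda - \lambda^k\|^2 - \|\lambda - \lambda^{k+1}\|^2 \right)
$$

and 

$$
\begin{aligned}
\mu (x_i - x_i^{k+1})^\top A_i^\top A_i (x_i^{k+1} - x_i^0)
&= \frac{\mu}{2} \left( \|A_i x_i - A_i x_i^0\|^2 - \|A_i x_i^{k+1} - A_i x_i^0\|^2 - \|A_i x_i - A_i x_i^{k+1}\|^2 \right) \\
&\le \frac{\mu}{2} \left( \|A_i x_i - A_i x_i^0\|^2 - \|A_i x_i - A_i x_i^{k+1}\|^2 \right).
\end{aligned}
$$

Letting $u = u^*$ in (3.12), and invoking the convexity of $f_i$, which gives 

$$
f_i(x_i^*) - f_i(x_i^{k+1}) \ge (x_i^* - x_i^{k+1})^\top g_i(x_i^{k+1}), \quad i = 1, 2, \ldots, N,
$$

and 

$$
\frac{\gamma}{2} \sum_{i=2}^{N-1} \Big\| \sum_{j=1}^{i-1} A_j x_j^* + \sum_{j=i}^N A_j x_j^{k+1} - b \Big\|^2
= \frac{\gamma}{2} \sum_{i=2}^{N-1} \Big\| \sum_{j=i}^N A_j (x_j^{k+1} - x_j^*) \Big\|^2
\le \frac{\gamma (N+1)(N-2)}{2} \sum_{i=2}^N \|A_i x_i^* - A_i x_i^{k+1}\|^2,
$$

we obtain 

$$
\begin{aligned}
& f(u^*) - f(u^{k+1}) +
\begin{pmatrix} x_1^* - x_1^{k+1} \\ x_2^* - x_2^{k+1} \\ \vdots \\ x_N^* - x_N^{k+1} \\ \lambda - \lambda^{k+1} \end{pmatrix}^\top
\begin{pmatrix} -A_1^\top \lambda^{k+1} \\ -A_2^\top \lambda^{k+1} \\ \vdots \\ -A_N^\top \lambda^{k+1} \\ \sum_{i=1}^N A_i x_i^{k+1} - b \end{pmatrix}
+ \frac{1}{2\gamma} \left( \|\lambda - \lambda^k\|^2 - \|\lambda - \lambda^{k+1}\|^2 \right) \\
& + \frac{\mu}{2} \sum_{i=2}^N \left( \|A_i x_i^* - A_i x_i^0\|^2 - \|A_i x_i^* - A_i x_i^{k+1}\|^2 \right) \\
& + \frac{\gamma}{2} \sum_{i=1}^{N-1} \left( \Big\| \sum_{j=1}^{i} A_j x_j^* + \sum_{j=i+1}^N A_j x_j^k - b \Big\|^2 - \Big\| \sum_{j=1}^{i} A_j x_j^* + \sum_{j=i+1}^N A_j x_j^{k+1} - b \Big\|^2 \right) \\
& + \frac{\gamma (N+1)(N-2)}{2} \sum_{i=2}^N \|A_i x_i^* - A_i x_i^{k+1}\|^2 \ge 0.
\end{aligned}
$$

This together with the facts that $\mu = \epsilon (N - 2)(N + 1)$ and $\gamma \le \epsilon$ implies that 

$$
\frac{\gamma (N+1)(N-2)}{2} \sum_{i=2}^N \|A_i x_i^* - A_i x_i^{k+1}\|^2 - \frac{\mu}{2} \sum_{i=2}^N \|A_i x_i^* - A_i x_i^{k+1}\|^2 \le 0,
$$

which further implies the desired inequality (3.9). 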


□ 


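Now we are ready to prove the $O(1/\epsilon^2)$ iteration complexity of the ADMM for (1.1) in an ergodic sense. 

Theorem 3.2 Let $(x_1^{k+1}, \ldots, x_N^{k+1}, \lambda^{k+1}) \in \Omega$ be generated by ADMM (3.2)-(3.4) from given 
$(x_2^k, \ldots, x_N^k, \lambda^k)$. For any integer $t > 0$, let $\bar{u}^t = (\bar{x}_1^t, \bar{x}_2^t, \ldots, \bar{x}_N^t)$ and $\bar{\lambda}^t$ be defined as 

$$
\bar{x}_i^t = \frac{1}{t+1} \sum_{k=0}^{t} x_i^{k+1}, \quad i = 1, 2, \ldots, N, \qquad \bar{\lambda}^t = \frac{1}{t+1} \sum_{k=0}^{t} \lambda^{k+1}.
$$

For any $(u^*, \lambda^*) \in \Omega^*$, by defining $\rho := \|\lambda^*\| + 1$, it holds in Scenario 1 that 

$$
\begin{aligned}
0 &\le f(\bar{u}^t) - f(u^*) + \rho \Big\| \sum_{i=1}^N A_i \bar{x}_i^t - b \Big\| \\
&\le \frac{\rho^2 + \|\lambda^0\|^2}{\gamma (t+1)} + \frac{\gamma}{2(t+1)} \sum_{i=1}^{N-1} \Big\| \sum_{j=i+1}^N A_j (x_j^0 - x_j^*) \Big\|^2 + \frac{\epsilon (N-2)(N+1)}{2} \sum_{i=2}^N \|A_i x_i^* - A_i x_i^0\|^2.
\end{aligned}
$$

This also implies that when $t = O(1/\epsilon^2)$, $\bar{u}^t = (\bar{x}_1^t, \bar{x}_2^t, \ldots, \bar{x}_N^t)$ is an $\epsilon$-optimal solution to the original 
problem (1.1), i.e., both the error of the objective function value and the residual of the equality constraint 
satisfy that 

$$
|f(\bar{u}^t) - f(u^*)| = O(\epsilon), \quad \text{and} \quad \Big\| \sum_{i=1}^N A_i \bar{x}_i^t - b \Big\| = O(\epsilon). \tag{3.13}
$$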
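
The ergodic averages $\bar{x}_i^t$ used in the theorem are plain running means of the iterates $x_i^1, \ldots, x_i^{t+1}$. A minimal sketch follows, with illustrative function names; the incremental variant avoids storing all iterates, which matters when $t = O(1/\epsilon^2)$ is large.

```python
def ergodic_average(iterates):
    """Average of the stored iterates x^{1}, ..., x^{t+1} (each a list of floats)."""
    count = len(iterates)
    dim = len(iterates[0])
    return [sum(x[d] for x in iterates) / count for d in range(dim)]


def running_ergodic_averages(iterates):
    """Yield the averages for t = 0, 1, ... via the incremental update
    avg_t = avg_{t-1} + (x^{t+1} - avg_{t-1}) / (t + 1)."""
    avg = None
    for t, x in enumerate(iterates):
        if avg is None:
            avg = list(x)
        else:
            avg = [a + (xi - a) / (t + 1) for a, xi in zip(avg, x)]
        yield list(avg)
```

The same averaging applies to the multipliers $\lambda^{k+1}$ to obtain $\bar{\lambda}^t$.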


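Proof. Because $(u^k, \lambda^k) \in \Omega$, it holds that $(\bar{u}^t, \bar{\lambda}^t) \in \Omega$ for all $t \ge 0$. By Lemma 3.1 and invoking the 
convexity of function $f(\cdot)$, we have 

$$
\begin{aligned}
& f(u^*) - f(\bar{u}^t) + \lambda^\top \Big( \sum_{i=1}^N A_i \bar{x}_i^t - b \Big) \\
={} & f(u^*) - f(\bar{u}^t) +
\begin{pmatrix} x_1^* - \bar{x}_1^t \\ x_2^* - \bar{x}_2^t \\ \vdots \\ x_N^* - \bar{x}_N^t \\ \lambda - \bar{\lambda}^t \end{pmatrix}^\top
\begin{pmatrix} -A_1^\top \bar{\lambda}^t \\ -A_2^\top \bar{\lambda}^t \\ \vdots \\ -A_N^\top \bar{\lambda}^t \\ \sum_{i=1}^N A_i \bar{x}_i^t - b \end{pmatrix} \\
\ge{} & \frac{1}{t+1} \sum_{k=0}^{t} \left[ f(u^*) - f(u^{k+1}) +
\begin{pmatrix} x_1^* - x_1^{k+1} \\ x_2^* - x_2^{k+1} \\ \vdots \\ x_N^* - x_N^{k+1} \\ \lambda - \lambda^{k+1} \end{pmatrix}^\top
\begin{pmatrix} -A_1^\top \lambda^{k+1} \\ -A_2^\top \lambda^{k+1} \\ \vdots \\ -A_N^\top \lambda^{k+1} \\ \sum_{i=1}^N A_i x_i^{k+1} - b \end{pmatrix} \right] \\
\ge{} & \frac{1}{t+1} \sum_{k=0}^{t} \Bigg[ \frac{1}{2\gamma} \left( \|\lambda - \lambda^{k+1}\|^2 - \|\lambda - \lambda^k\|^2 \right) - \frac{\epsilon (N-2)(N+1)}{2} \sum_{i=2}^N \|A_i x_i^* - A_i x_i^0\|^2 \\
& \qquad + \frac{\gamma}{2} \sum_{i=1}^{N-1} \left( \Big\| \sum_{j=1}^{i} A_j x_j^* + \sum_{j=i+1}^N A_j x_j^{k+1} - b \Big\|^2 - \Big\| \sum_{j=1}^{i} A_j x_j^* + \sum_{j=i+1}^N A_j x_j^k - b \Big\|^2 \right) \Bigg] \\
\ge{} & - \frac{1}{2\gamma (t+1)} \|\lambda - \lambda^0\|^2 - \frac{\gamma}{2(t+1)} \sum_{i=1}^{N-1} \Big\| \sum_{j=1}^{i} A_j x_j^* + \sum_{j=i+1}^N A_j x_j^0 - b \Big\|^2 - \frac{\epsilon (N-2)(N+1)}{2} \sum_{i=2}^N \|A_i x_i^* - A_i x_i^0\|^2.
\end{aligned} \tag{3.14}
$$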
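Note that this inequality holds for all $\lambda \in \mathbb{R}^p$. From the optimality condition (2.1) we obtain 

$$
0 \ge f(u^*) - f(\bar{u}^t) + (\lambda^*)^\top \Big( \sum_{i=1}^N A_i \bar{x}_i^t - b \Big).
$$

Moreover, since $\rho := \|\lambda^*\| + 1$, by applying the Cauchy-Schwarz inequality, we obtain 

$$
0 \le f(\bar{u}^t) - f(u^*) + \rho \Big\| \sum_{i=1}^N A_i \bar{x}_i^t - b \Big\|. \tag{3.15}
$$

By setting $\lambda = -\rho \big( \sum_{i=1}^N A_i \bar{x}_i^t - b \big) / \big\| \sum_{i=1}^N A_i \bar{x}_i^t - b \big\|$ in (3.14), and noting that $\|\lambda\| = \rho$, we obtain 

$$
\begin{aligned}
& f(\bar{u}^t) - f(u^*) + \rho \Big\| \sum_{i=1}^N A_i \bar{x}_i^t - b \Big\| \\
\le{} & \frac{\rho^2 + \|\lambda^0\|^2}{\gamma (t+1)} + \frac{\gamma}{2(t+1)} \sum_{i=1}^{N-1} \Big\| \sum_{j=i+1}^N A_j (x_j^0 - x_j^*) \Big\|^2 + \frac{\epsilon (N-2)(N+1)}{2} \sum_{i=2}^N \|A_i x_i^* - A_i x_i^0\|^2.
\end{aligned} \tag{3.16}
$$

When $t = O(1/\epsilon^2)$, and together with the condition that $\epsilon/2 \le \gamma \le \epsilon$, we have 

$$
\frac{\rho^2 + \|\lambda^0\|^2}{\gamma (t+1)} + \frac{\gamma}{2(t+1)} \sum_{i=1}^{N-1} \Big\| \sum_{j=i+1}^N A_j (x_j^0 - x_j^*) \Big\|^2 + \frac{\epsilon (N-2)(N+1)}{2} \sum_{i=2}^N \|A_i x_i^* - A_i x_i^0\|^2 = O(\epsilon). \tag{3.17}
$$

We now define the function 

$$
v(\xi) = \min \Big\{ f(u) \;\Big|\; \sum_{i=1}^N A_i x_i - b = \xi, \; x_i \in \mathcal{X}_i, \; i = 1, 2, \ldots, N \Big\}.
$$

It is easy to verify that $v$ is convex, $v(0) = f(u^*)$, and $\lambda^* \in \partial v(0)$. Therefore, from the convexity of $v$, 
it holds that 

$$
v(\xi) \ge v(0) + \langle \lambda^*, \xi \rangle \ge f(u^*) - \|\lambda^*\| \|\xi\|. \tag{3.18}
$$

Let $\bar{\xi} = \sum_{i=1}^N A_i \bar{x}_i^t - b$; we have $f(\bar{u}^t) \ge v(\bar{\xi})$. Therefore, combining (3.15), (3.17) and (3.18), we get 

$$
\begin{aligned}
- \|\lambda^*\| \|\bar{\xi}\| &\le f(\bar{u}^t) - f(u^*) \\
&\le \frac{\rho^2 + \|\lambda^0\|^2}{\gamma (t+1)} + \frac{\gamma}{2(t+1)} \sum_{i=1}^{N-1} \Big\| \sum_{j=i+1}^N A_j (x_j^0 - x_j^*) \Big\|^2 + \frac{\epsilon (N-2)(N+1)}{2} \sum_{i=2}^N \|A_i x_i^* - A_i x_i^0\|^2 - \rho \|\bar{\xi}\| \\
&\le C \epsilon - \rho \|\bar{\xi}\|,
\end{aligned}
$$

which, by using $\rho = \|\lambda^*\| + 1$, yields 

$$
\Big\| \sum_{i=1}^N A_i \bar{x}_i^t - b \Big\| = \|\bar{\xi}\| \le C \epsilon. \tag{3.19}
$$

Moreover, by combining (3.15) and (3.19), one obtains that 

$$
- \rho C \epsilon \le - \rho \|\bar{\xi}\| \le f(\bar{u}^t) - f(u^*) \le (1 - \rho) C \epsilon. \tag{3.20}
$$

Finally, we note that (3.19), (3.20) imply (3.13). □ 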


4 Iteration Complexity of ADMM: Kurdyka-Lojasiewicz Property 


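In this section, we prove an $O(1/\epsilon)$ iteration complexity of ADMM (1.2) under the conditions in 
Scenario 2 of Table 1. Indeed, we prove that the ADMM for the original problem (1.1) returns an $\epsilon$-optimal 
solution within $O(1/\epsilon)$ iterations in Scenario 2. 

Under the conditions in Scenario 2, the multi-block ADMM (1.2) for solving (1.1) can be rewritten as: 

$$
x_1^{k+1} := \arg\min_{x_1 \in \mathcal{X}_1} f_1(x_1) + \frac{\gamma}{2} \Big\| A_1 x_1 + \sum_{j=2}^{N-1} A_j x_j^k + x_N^k - b - \frac{1}{\gamma} \lambda^k \Big\|^2, \tag{4.1}
$$

$$
x_i^{k+1} := \arg\min_{x_i \in \mathcal{X}_i} f_i(x_i) + \frac{\gamma}{2} \Big\| \sum_{j=1}^{i-1} A_j x_j^{k+1} + A_i x_i + \sum_{j=i+1}^{N-1} A_j x_j^k + x_N^k - b - \frac{1}{\gamma} \lambda^k \Big\|^2, \quad i = 2, \ldots, N-1, \tag{4.2}
$$

$$
x_N^{k+1} := \arg\min_{x_N} f_N(x_N) + \frac{\gamma}{2} \Big\| \sum_{j=1}^{N-1} A_j x_j^{k+1} + x_N - b - \frac{1}{\gamma} \lambda^k \Big\|^2, \tag{4.3}
$$

$$
\lambda^{k+1} := \lambda^k - \gamma \left( A_1 x_1^{k+1} + A_2 x_2^{k+1} + \cdots + A_{N-1} x_{N-1}^{k+1} + x_N^{k+1} - b \right). \tag{4.4}
$$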

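The first-order optimality conditions for (4.1)-(4.3) are given respectively by $x_i^{k+1} \in \mathcal{X}_i$, $i = 1, \ldots, N - 1$, 
and 

$$
g_1(x_1^{k+1}) - A_1^\top \lambda^k + \gamma A_1^\top \Big( A_1 x_1^{k+1} + \sum_{j=2}^{N-1} A_j x_j^k + x_N^k - b \Big) = 0, \tag{4.5}
$$

$$
g_i(x_i^{k+1}) - A_i^\top \lambda^k + \gamma A_i^\top \Big( \sum_{j=1}^{i} A_j x_j^{k+1} + \sum_{j=i+1}^{N-1} A_j x_j^k + x_N^k - b \Big) = 0, \tag{4.6}
$$

$$
\nabla f_N(x_N^{k+1}) - \lambda^k + \gamma \Big( \sum_{j=1}^{N-1} A_j x_j^{k+1} + x_N^{k+1} - b \Big) = 0, \tag{4.7}
$$

where $g_i \in \partial (f_i + \mathbf{1}_{\mathcal{X}_i})$ is a subgradient of $f_i + \mathbf{1}_{\mathcal{X}_i}$ for $i = 1, 2, \ldots, N - 1$. Moreover, by combining with 
(4.4), (4.5)-(4.7) can be rewritten as 

$$
g_1(x_1^{k+1}) - A_1^\top \lambda^{k+1} + \gamma A_1^\top \Big( \sum_{j=2}^{N-1} A_j (x_j^k - x_j^{k+1}) + (x_N^k - x_N^{k+1}) \Big) = 0, \tag{4.8}
$$

$$
g_i(x_i^{k+1}) - A_i^\top \lambda^{k+1} + \gamma A_i^\top \Big( \sum_{j=i+1}^{N-1} A_j (x_j^k - x_j^{k+1}) + (x_N^k - x_N^{k+1}) \Big) = 0, \tag{4.9}
$$

$$
\nabla f_N(x_N^{k+1}) - \lambda^{k+1} = 0. \tag{4.10}
$$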

Note that in Scenario 2 we require that $\mathcal{L}_\gamma$ is a Kurdyka-Lojasiewicz (KL) function. Let us first 
introduce the notion of the KL function and the KL property, which can be found, e.g., in [3, 1]. We 
denote $\mathrm{dist}(x, S) := \inf\{\|y - x\| : y \in S\}$ as the distance from $x$ to $S$. Let $\eta \in (0, +\infty]$. We further 
denote $\Phi_\eta$ to be the class of all concave and continuous functions $\varphi : [0, \eta) \to \mathbb{R}_+$ satisfying the following 
conditions: 

1. $\varphi(0) = 0$; 

2. $\varphi$ is $C^1$ on $(0, \eta)$ and continuous at $0$; 

3. for all $s \in (0, \eta)$: $\varphi'(s) > 0$. 

Definition 4.1 Let $f : \Omega \to (-\infty, +\infty]$ be proper and lower semicontinuous. 

1. The function $f$ has the Kurdyka-Lojasiewicz (KL) property at $w_0 \in \{w \in \Omega : \partial f(w) \ne \emptyset\}$ if there 
exist $\eta \in (0, +\infty]$, a neighbourhood $W_0$ of $w_0$ and a function $\varphi \in \Phi_\eta$ such that for all 

$$
\bar{w}_0 \in W_0 \cap \{w \in \Omega : f(w_0) < f(w) < f(w_0) + \eta\},
$$

the following inequality holds: 

$$
\varphi'(f(\bar{w}_0) - f(w_0)) \, \mathrm{dist}(0, \partial f(\bar{w}_0)) \ge 1. \tag{4.11}
$$

2. The function $f$ is a KL function if $f$ satisfies the KL property at each point of $\Omega \cap \{\partial f(w) \ne \emptyset\}$. 
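
To see the KL inequality (4.11) at work on a concrete case, consider the one-dimensional function $f(x) = x^2$ at the point $w_0 = 0$ with the desingularizing function $\varphi(s) = \sqrt{s}$, an example chosen here for illustration and not taken from the paper. Then $\varphi'(f(x) - f(0)) \, \mathrm{dist}(0, \partial f(x)) = \frac{1}{2|x|} \cdot 2|x| = 1$ for every $x \ne 0$, so (4.11) holds with $\eta = +\infty$. The sketch below evaluates this product numerically; all function names are assumptions.

```python
import math


def f(x):
    return x * x


def subgradient_distance(x):
    """dist(0, ∂f(x)); f is differentiable, so this is |f'(x)| = |2x|."""
    return abs(2.0 * x)


def phi(s):
    """Candidate desingularizing function: concave, phi(0) = 0, phi' > 0 on (0, inf)."""
    return math.sqrt(s)


def phi_prime(s):
    return 0.5 / math.sqrt(s)


def kl_product(x, x0=0.0):
    """Left-hand side of (4.11) at the point x relative to the reference point x0."""
    gap = f(x) - f(x0)
    if gap <= 0.0:
        raise ValueError("the KL inequality is only required where f(x) > f(x0)")
    return phi_prime(gap) * subgradient_distance(x)
```

Evaluating `kl_product` at any nonzero point returns a value equal to one up to rounding, which is the smallest value allowed by (4.11).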

Remark 4.1 It is important to remark that most convex functions from practical applications satisfy the 
KL property; see Section 5.1 of EF. In fact, convex functions that do not satisfy the KL property exist (see 
m for a counterexample) but they are rare and difficult to construct. Indeed, $\mathcal{L}_\gamma$ will be a KL function 
if each $f_i$ satisfies a growth condition, or uniform convexity, or they are general convex semialgebraic or 
real analytic functions. We refer the interested readers to m and ^ for more information. 

The following result, which is called the uniformized KL property, is from Lemma 6 of [1]. 

Lemma 4.2 [Lemma 6 of [1]] Let $\Omega$ be a compact set and $f : \mathbb{R}^n \to (-\infty, \infty]$ be a proper and lower 
semi-continuous function. Assume that $f$ is constant on $\Omega$ and satisfies the KL property at each point 
of $\Omega$. Then, there exist $\epsilon > 0$, $\eta > 0$ and $\varphi \in \Phi_\eta$ such that for all $\bar{u}$ in $\Omega$ and all $u$ in the intersection 

$$
\{u \in \mathbb{R}^n : \mathrm{dist}(u, \Omega) < \epsilon\} \cap \{u \in \mathbb{R}^n : f(\bar{u}) < f(u) < f(\bar{u}) + \eta\},
$$

the following inequality holds: 

$$
\varphi'(f(u) - f(\bar{u})) \, \mathrm{dist}(0, \partial f(u)) \ge 1.
$$

We now give a formal definition of the limit point set. Let the sequence $\{w^k = (x_1^k, \ldots, x_N^k, \lambda^k)\}_{k=0,1,\ldots}$ be a 
sequence generated by the multi-block ADMM (1.2) from a starting point $w^0 = (x_1^0, \ldots, x_N^0, \lambda^0)$. The set of 
all limit points is denoted by $\Omega(w^0)$, i.e., 

$$
\Omega(w^0) = \left\{ \bar{w} \in \mathbb{R}^{n_1} \times \cdots \times \mathbb{R}^{n_N} \times \mathbb{R}^p : \exists \text{ an infinite sequence } \{w^{k_l}\}_{l=1,\ldots} \text{ such that } w^{k_l} \to \bar{w} \text{ as } l \to \infty \right\}.
$$

In the following we present the main results in this section. Specifically, Theorem 4.3 gives the 
convergence of the multi-block ADMM (1.2), and we include its proof in the Appendix. Theorem 4.5 shows that 
the whole sequence generated by the multi-block ADMM (1.2) converges. 


Theorem 4.3 Under the conditions in Scenario 2 of Table 1, the following hold: 

1. $\Omega(w^0)$ is a non-empty set, and any point in $\Omega(w^0)$ is a stationary point of $\mathcal{L}_\gamma(x_1, \ldots, x_N, \lambda)$; 

2. $\Omega(w^0)$ is a compact and connected set; 

3. The function $\mathcal{L}_\gamma(x_1, \ldots, x_N, \lambda)$ is finite and constant on $\Omega(w^0)$. 


Remark 4.4 In Theorem 4.3 we do not require $\mathcal{L}_\gamma$ to be a KL function, which is only required in 
Theorem 4.5 (see next). 


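Theorem 4.5 Suppose that $\mathcal{L}_\gamma(x_1, \ldots, x_N, \lambda)$ is a KL function. Let the sequence $w^k = (x_1^k, \ldots, x_N^k, \lambda^k)$ 
be generated by the multi-block ADMM (1.2). Let $w^* = (x_1^*, \ldots, x_N^*, \lambda^*) \in \Omega(w^0)$. Then the sequence $w^k = 
(x_1^k, \ldots, x_N^k, \lambda^k)$ has a finite length, i.e., 

$$
\sum_{k=0}^{\infty} \left( \sum_{i=1}^{N-1} \|A_i x_i^k - A_i x_i^{k+1}\| + \|x_N^k - x_N^{k+1}\| + \|\lambda^k - \lambda^{k+1}\| \right) \le G, \tag{4.12}
$$

where the constant $G$ is given by 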


G:=2 ||AiX° - AiX^^II -h ||x?^ - x]vl| + ||A° - A^||^ -h - ^y{w*)) , 


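and 

$$
M = \max \left\{ \gamma \sum_{i=1}^{N-1} \|A_i^\top\|, \; \frac{1}{\gamma} + 1 + \sum_{i=1}^{N-1} \|A_i^\top\| \right\} > 0,
$$

and the whole sequence $(A_1 x_1^k, A_2 x_2^k, \ldots, A_{N-1} x_{N-1}^k, x_N^k, \lambda^k)$ converges to $(A_1 x_1^*, \ldots, A_{N-1} x_{N-1}^*, x_N^*, \lambda^*)$. 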
Proof. The proof of this theorem is almost identical to the proof of Theorem 1 in [1], by utilizing the 
uniformized KL property (Lemma 4.2), and the facts that $\Omega(w^0)$ is compact and $\mathcal{L}_\gamma(w)$ is constant on $\Omega(w^0)$ (proved 
in Theorem 4.3), with function $\Psi$ replaced by $\mathcal{L}_\gamma$, and some other minor changes. We thus omit the 
proof for succinctness. □ 

Based on Theorem 4.5 we prove a key lemma for analyzing the iteration complexity for the ADMM. 


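Lemma 4.6 Let $(x_1^{k+1}, \ldots, x_N^{k+1}, \lambda^{k+1}) \in \Omega$ be generated by the multi-block ADMM (4.1)-(4.4) (or 
equivalently, (1.2)) from given $(x_2^k, \ldots, x_N^k, \lambda^k)$. For any $u^* = (x_1^*, x_2^*, \ldots, x_N^*) \in \Omega^*$ and $\lambda \in \mathbb{R}^p$, it 
holds in Scenario 2 that 

$$
\begin{aligned}
& f(u^*) - f(u^{k+1}) +
\begin{pmatrix} x_1^* - x_1^{k+1} \\ x_2^* - x_2^{k+1} \\ \vdots \\ x_N^* - x_N^{k+1} \\ \lambda - \lambda^{k+1} \end{pmatrix}^\top
\begin{pmatrix} -A_1^\top \lambda^{k+1} \\ -A_2^\top \lambda^{k+1} \\ \vdots \\ -\lambda^{k+1} \\ \sum_{i=1}^{N-1} A_i x_i^{k+1} + x_N^{k+1} - b \end{pmatrix} \\
& + \frac{\gamma}{2} \left( \Big\| A_1 x_1^* + \sum_{i=2}^{N-1} A_i x_i^k + x_N^k - b \Big\|^2 - \Big\| A_1 x_1^* + \sum_{i=2}^{N-1} A_i x_i^{k+1} + x_N^{k+1} - b \Big\|^2 \right)
+ \frac{1}{2\gamma} \left( \|\lambda - \lambda^k\|^2 - \|\lambda - \lambda^{k+1}\|^2 \right) \\
& + \gamma D (N - 2) \left( \sum_{i=1}^{N-1} \|A_i x_i^k - A_i x_i^{k+1}\| + \|x_N^k - x_N^{k+1}\| \right) \ge 0,
\end{aligned} \tag{4.13}
$$

where $D$ is a constant. 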


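Proof. Note that combining (4.8)-(4.10) yields 

$$
\begin{pmatrix} x_1 - x_1^{k+1} \\ x_2 - x_2^{k+1} \\ \vdots \\ x_N - x_N^{k+1} \end{pmatrix}^\top
\left[
\begin{pmatrix} g_1(x_1^{k+1}) - A_1^\top \lambda^{k+1} \\ g_2(x_2^{k+1}) - A_2^\top \lambda^{k+1} \\ \vdots \\ \nabla f_N(x_N^{k+1}) - \lambda^{k+1} \end{pmatrix}
+ \begin{pmatrix}
\gamma A_1^\top A_2 & \gamma A_1^\top A_3 & \cdots & \gamma A_1^\top \\
0 & \gamma A_2^\top A_3 & \cdots & \gamma A_2^\top \\
\vdots & & \ddots & \vdots \\
0 & 0 & \cdots & \gamma A_{N-1}^\top \\
0 & 0 & \cdots & 0
\end{pmatrix}
\begin{pmatrix} x_2^k - x_2^{k+1} \\ \vdots \\ x_N^k - x_N^{k+1} \end{pmatrix}
\right] \ge 0, \tag{4.14}
$$

where $x_i \in \mathcal{X}_i$ and $g_i \in \partial (f_i + \mathbf{1}_{\mathcal{X}_i})$ is a subgradient of $f_i + \mathbf{1}_{\mathcal{X}_i}$ for $i = 1, 2, \ldots, N - 1$. 

The key step in our proof is to bound the following terms 

$$
(x_i - x_i^{k+1})^\top A_i^\top \Big( \sum_{j=i+1}^{N-1} A_j (x_j^k - x_j^{k+1}) + (x_N^k - x_N^{k+1}) \Big), \quad i = 1, 2, \ldots, N - 1.
$$

For the first term, we have (similar to Lemma 3.1) 

$$
\begin{aligned}
& \gamma (x_1 - x_1^{k+1})^\top A_1^\top \Big( \sum_{i=2}^{N-1} A_i (x_i^k - x_i^{k+1}) + (x_N^k - x_N^{k+1}) \Big) \\
\le{} & \frac{\gamma}{2} \left( \Big\| A_1 x_1 + \sum_{i=2}^{N-1} A_i x_i^k + x_N^k - b \Big\|^2 - \Big\| A_1 x_1 + \sum_{i=2}^{N-1} A_i x_i^{k+1} + x_N^{k+1} - b \Big\|^2 \right) + \frac{1}{2\gamma} \|\lambda^{k+1} - \lambda^k\|^2.
\end{aligned}
$$

For $i = 2, 3, \ldots, N - 1$, we have 

$$
\begin{aligned}
& (x_i - x_i^{k+1})^\top A_i^\top \Big( \sum_{j=i+1}^{N-1} A_j (x_j^k - x_j^{k+1}) + (x_N^k - x_N^{k+1}) \Big) \\
\le{} & \|A_i x_i - A_i x_i^{k+1}\| \Big( \sum_{j=i+1}^{N-1} \|A_j x_j^k - A_j x_j^{k+1}\| + \|x_N^k - x_N^{k+1}\| \Big) \\
\le{} & \|A_i x_i - A_i x_i^{k+1}\| \Big( \sum_{j=1}^{N-1} \|A_j x_j^k - A_j x_j^{k+1}\| + \|x_N^k - x_N^{k+1}\| \Big).
\end{aligned}
$$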


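Therefore, 

$$
\begin{aligned}
& \begin{pmatrix} x_1 - x_1^{k+1} \\ x_2 - x_2^{k+1} \\ \vdots \\ x_N - x_N^{k+1} \end{pmatrix}^\top
\begin{pmatrix}
\gamma A_1^\top A_2 & \gamma A_1^\top A_3 & \cdots & \gamma A_1^\top \\
0 & \gamma A_2^\top A_3 & \cdots & \gamma A_2^\top \\
\vdots & & \ddots & \vdots \\
0 & 0 & \cdots & \gamma A_{N-1}^\top \\
0 & 0 & \cdots & 0
\end{pmatrix}
\begin{pmatrix} x_2^k - x_2^{k+1} \\ \vdots \\ x_N^k - x_N^{k+1} \end{pmatrix} \\
\le{} & \frac{\gamma}{2} \left( \Big\| A_1 x_1 + \sum_{i=2}^{N-1} A_i x_i^k + x_N^k - b \Big\|^2 - \Big\| A_1 x_1 + \sum_{i=2}^{N-1} A_i x_i^{k+1} + x_N^{k+1} - b \Big\|^2 \right) \\
& + \gamma \Big( \sum_{i=2}^{N-1} \|A_i x_i - A_i x_i^{k+1}\| \Big) \Big( \sum_{i=1}^{N-1} \|A_i x_i^k - A_i x_i^{k+1}\| + \|x_N^k - x_N^{k+1}\| \Big) + \frac{1}{2\gamma} \|\lambda^{k+1} - \lambda^k\|^2.
\end{aligned} \tag{4.15}
$$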
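Combining (4.4), (4.14) and (4.15), it holds for any $\lambda \in \mathbb{R}^p$ that 

$$
\begin{aligned}
& \begin{pmatrix} x_1 - x_1^{k+1} \\ x_2 - x_2^{k+1} \\ \vdots \\ x_N - x_N^{k+1} \\ \lambda - \lambda^{k+1} \end{pmatrix}^\top
\begin{pmatrix} g_1(x_1^{k+1}) - A_1^\top \lambda^{k+1} \\ g_2(x_2^{k+1}) - A_2^\top \lambda^{k+1} \\ \vdots \\ \nabla f_N(x_N^{k+1}) - \lambda^{k+1} \\ \sum_{i=1}^{N-1} A_i x_i^{k+1} + x_N^{k+1} - b \end{pmatrix}
+ \frac{1}{\gamma} (\lambda - \lambda^{k+1})^\top (\lambda^{k+1} - \lambda^k) \\
& + \frac{\gamma}{2} \left( \Big\| A_1 x_1 + \sum_{i=2}^{N-1} A_i x_i^k + x_N^k - b \Big\|^2 - \Big\| A_1 x_1 + \sum_{i=2}^{N-1} A_i x_i^{k+1} + x_N^{k+1} - b \Big\|^2 \right) \\
& + \gamma \Big( \sum_{i=2}^{N-1} \|A_i x_i - A_i x_i^{k+1}\| \Big) \Big( \sum_{i=1}^{N-1} \|A_i x_i^k - A_i x_i^{k+1}\| + \|x_N^k - x_N^{k+1}\| \Big) + \frac{1}{2\gamma} \|\lambda^{k+1} - \lambda^k\|^2 \ge 0.
\end{aligned} \tag{4.16}
$$

Using (2.5), we have 

$$
\frac{1}{\gamma} (\lambda - \lambda^{k+1})^\top (\lambda^{k+1} - \lambda^k) + \frac{1}{2\gamma} \|\lambda^{k+1} - \lambda^k\|^2 = \frac{1}{2\gamma} \left( \|\lambda - \lambda^k\|^2 - \|\lambda - \lambda^{k+1}\|^2 \right).
$$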

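Letting $u = u^*$ in (4.16), and invoking the convexity of $f_i$, we obtain 

$$
\begin{aligned}
& f(u^*) - f(u^{k+1}) +
\begin{pmatrix} x_1^* - x_1^{k+1} \\ x_2^* - x_2^{k+1} \\ \vdots \\ x_N^* - x_N^{k+1} \\ \lambda - \lambda^{k+1} \end{pmatrix}^\top
\begin{pmatrix} -A_1^\top \lambda^{k+1} \\ -A_2^\top \lambda^{k+1} \\ \vdots \\ -\lambda^{k+1} \\ \sum_{i=1}^{N-1} A_i x_i^{k+1} + x_N^{k+1} - b \end{pmatrix} \\
& + \frac{\gamma}{2} \left( \Big\| A_1 x_1^* + \sum_{i=2}^{N-1} A_i x_i^k + x_N^k - b \Big\|^2 - \Big\| A_1 x_1^* + \sum_{i=2}^{N-1} A_i x_i^{k+1} + x_N^{k+1} - b \Big\|^2 \right)
+ \frac{1}{2\gamma} \left( \|\lambda - \lambda^k\|^2 - \|\lambda - \lambda^{k+1}\|^2 \right) \\
& + \gamma \Big( \sum_{i=2}^{N-1} \|A_i x_i^* - A_i x_i^{k+1}\| \Big) \Big( \sum_{i=1}^{N-1} \|A_i x_i^k - A_i x_i^{k+1}\| + \|x_N^k - x_N^{k+1}\| \Big) \ge 0.
\end{aligned}
$$

From Theorem 4.5 we know that the whole sequence $(A_1 x_1^k, A_2 x_2^k, \ldots, A_{N-1} x_{N-1}^k, x_N^k, \lambda^k)$ converges to 
$(A_1 x_1^*, \ldots, A_{N-1} x_{N-1}^*, x_N^*, \lambda^*)$. Therefore, there exists a constant $D > 0$ such that 

$$
\|A_i x_i^* - A_i x_i^{k+1}\| \le D \tag{4.17}
$$

for any $k \ge 0$ and any $i = 2, 3, \ldots, N - 1$. This implies (4.13). 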

□ 


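Now, we are ready to prove the $O(1/\epsilon)$ iteration complexity of the multi-block ADMM for (1.1). 

Theorem 4.7 Let $(x_1^{k+1}, \ldots, x_N^{k+1}, \lambda^{k+1}) \in \Omega$ be generated by ADMM (4.1)-(4.4) from given 
$(x_2^k, \ldots, x_N^k, \lambda^k)$. For any integer $t > 0$, let $\bar{u}^t = (\bar{x}_1^t, \bar{x}_2^t, \ldots, \bar{x}_N^t)$ and $\bar{\lambda}^t$ be defined as 

$$
\bar{x}_i^t = \frac{1}{t+1} \sum_{k=0}^{t} x_i^{k+1}, \quad i = 1, 2, \ldots, N, \qquad \bar{\lambda}^t = \frac{1}{t+1} \sum_{k=0}^{t} \lambda^{k+1}.
$$

For any $(u^*, \lambda^*) \in \Omega^*$, by defining $\rho := \|\lambda^*\| + 1$, it holds in Scenario 2 that 

$$
\begin{aligned}
0 &\le f(\bar{u}^t) - f(u^*) + \rho \Big\| \sum_{i=1}^N A_i \bar{x}_i^t - b \Big\| \\
&\le \frac{\rho^2 + \|\lambda^0\|^2}{\gamma (t+1)} + \frac{\gamma}{2(t+1)} \Big\| \sum_{i=2}^{N-1} A_i (x_i^0 - x_i^*) + (x_N^0 - x_N^*) \Big\|^2 + \frac{\gamma D G (N-2)}{t+1}.
\end{aligned}
$$

Note this also implies that when $t = O(1/\epsilon)$, $\bar{u}^t = (\bar{x}_1^t, \bar{x}_2^t, \ldots, \bar{x}_N^t)$ is an $\epsilon$-optimal solution to the 
original problem (1.1), i.e., both the error of the objective function value and the residual of the equality 
constraint satisfy that 

$$
|f(\bar{u}^t) - f(u^*)| = O(\epsilon), \quad \text{and} \quad \Big\| \sum_{i=1}^N A_i \bar{x}_i^t - b \Big\| = O(\epsilon). \tag{4.18}
$$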


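Proof. Because $(u^k, \lambda^k) \in \Omega$, it holds that $(\bar{u}^t, \bar{\lambda}^t) \in \Omega$ for all $t \ge 0$. By Lemma 4.6 and invoking the 
convexity of function $f(\cdot)$, we have 

$$
\begin{aligned}
& f(u^*) - f(\bar{u}^t) + \lambda^\top \Big( \sum_{i=1}^{N-1} A_i \bar{x}_i^t + \bar{x}_N^t - b \Big) \\
={} & f(u^*) - f(\bar{u}^t) +
\begin{pmatrix} x_1^* - \bar{x}_1^t \\ x_2^* - \bar{x}_2^t \\ \vdots \\ x_N^* - \bar{x}_N^t \\ \lambda - \bar{\lambda}^t \end{pmatrix}^\top
\begin{pmatrix} -A_1^\top \bar{\lambda}^t \\ -A_2^\top \bar{\lambda}^t \\ \vdots \\ -\bar{\lambda}^t \\ \sum_{i=1}^{N-1} A_i \bar{x}_i^t + \bar{x}_N^t - b \end{pmatrix} \\
\ge{} & \frac{1}{t+1} \sum_{k=0}^{t} \left[ f(u^*) - f(u^{k+1}) +
\begin{pmatrix} x_1^* - x_1^{k+1} \\ x_2^* - x_2^{k+1} \\ \vdots \\ x_N^* - x_N^{k+1} \\ \lambda - \lambda^{k+1} \end{pmatrix}^\top
\begin{pmatrix} -A_1^\top \lambda^{k+1} \\ -A_2^\top \lambda^{k+1} \\ \vdots \\ -\lambda^{k+1} \\ \sum_{i=1}^{N-1} A_i x_i^{k+1} + x_N^{k+1} - b \end{pmatrix} \right] \\
\ge{} & \frac{1}{t+1} \sum_{k=0}^{t} \Bigg[ \frac{1}{2\gamma} \left( \|\lambda - \lambda^{k+1}\|^2 - \|\lambda - \lambda^k\|^2 \right) \\
& \qquad + \frac{\gamma}{2} \left( \Big\| A_1 x_1^* + \sum_{i=2}^{N-1} A_i x_i^{k+1} + x_N^{k+1} - b \Big\|^2 - \Big\| A_1 x_1^* + \sum_{i=2}^{N-1} A_i x_i^k + x_N^k - b \Big\|^2 \right) \\
& \qquad - \gamma D (N-2) \Big( \sum_{i=1}^{N-1} \|A_i x_i^k - A_i x_i^{k+1}\| + \|x_N^k - x_N^{k+1}\| \Big) \Bigg] \\
\ge{} & - \frac{1}{2\gamma (t+1)} \|\lambda - \lambda^0\|^2 - \frac{\gamma}{2(t+1)} \Big\| A_1 x_1^* + \sum_{i=2}^{N-1} A_i x_i^0 + x_N^0 - b \Big\|^2 \\
& - \frac{\gamma D (N-2)}{t+1} \sum_{k=0}^{t} \Big( \sum_{i=1}^{N-1} \|A_i x_i^k - A_i x_i^{k+1}\| + \|x_N^k - x_N^{k+1}\| \Big) \\
\ge{} & - \frac{1}{2\gamma (t+1)} \|\lambda - \lambda^0\|^2 - \frac{\gamma}{2(t+1)} \Big\| A_1 x_1^* + \sum_{i=2}^{N-1} A_i x_i^0 + x_N^0 - b \Big\|^2 - \frac{\gamma D G (N-2)}{t+1},
\end{aligned}
$$

where the last inequality holds due to Theorem 4.5. Note that this inequality holds for all $\lambda \in \mathbb{R}^p$. 
From the optimality condition (2.1) we obtain 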


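$$
0 \ge f(u^*) - f(\bar{u}^t) + (\lambda^*)^\top \Big( \sum_{i=1}^{N-1} A_i \bar{x}_i^t + \bar{x}_N^t - b \Big).
$$

Moreover, since $\rho := \|\lambda^*\| + 1$, $\|\lambda - \lambda^0\|^2 \le 2 (\rho^2 + \|\lambda^0\|^2)$ for all $\|\lambda\| \le \rho$, and $\sum_{i=1}^{N-1} A_i x_i^* + x_N^* = b$, we 
obtain 

$$
\begin{aligned}
0 &\le f(\bar{u}^t) - f(u^*) + \rho \Big\| \sum_{i=1}^{N-1} A_i \bar{x}_i^t + \bar{x}_N^t - b \Big\| \\
&\le \frac{\rho^2 + \|\lambda^0\|^2}{\gamma (t+1)} + \frac{\gamma}{2(t+1)} \Big\| \sum_{i=2}^{N-1} A_i (x_i^0 - x_i^*) + (x_N^0 - x_N^*) \Big\|^2 + \frac{\gamma D G (N-2)}{t+1}.
\end{aligned} \tag{4.19}
$$

When $t = O(1/\epsilon)$, we have 

$$
\frac{\rho^2 + \|\lambda^0\|^2}{\gamma (t+1)} + \frac{\gamma}{2(t+1)} \Big\| \sum_{i=2}^{N-1} A_i (x_i^0 - x_i^*) + (x_N^0 - x_N^*) \Big\|^2 + \frac{\gamma D G (N-2)}{t+1} = O(\epsilon). \tag{4.20}
$$

By the same argument as in the proof of Theorem 3.2, (4.18) follows from (4.20). 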


□ 
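A Proof of Theorem 4.3 


We first prove a key lemma in the proof of Theorem 4.3. 

Lemma A.1 The following holds in Scenario 2. 

1. The iterative gap of the dual variable can be bounded by that of the primal variable, i.e., 

$$
\nabla f_N(x_N^{k+1}) = \lambda^{k+1}, \tag{A.1}
$$

and 

$$
\|\lambda^{k+1} - \lambda^k\|^2 \le L^2 \|x_N^{k+1} - x_N^k\|^2, \tag{A.2}
$$

where $L$ satisfies 

$$
\|\nabla f_N(x) - \nabla f_N(y)\| \le L \|x - y\|.
$$

2. The augmented Lagrangian $\mathcal{L}_\gamma$ has a sufficient decrease in each iteration, i.e., 

$$
\begin{aligned}
& \mathcal{L}_\gamma(x_1^k, \ldots, x_N^k, \lambda^k) - \mathcal{L}_\gamma(x_1^{k+1}, \ldots, x_N^{k+1}, \lambda^{k+1}) \\
\ge{} & \frac{\gamma^2 - 2L^2}{2\gamma (1 + L^2)} \left( \sum_{i=1}^{N-1} \|A_i x_i^k - A_i x_i^{k+1}\|^2 + \|x_N^k - x_N^{k+1}\|^2 + \|\lambda^k - \lambda^{k+1}\|^2 \right).
\end{aligned} \tag{A.3}
$$

3. The augmented Lagrangian $\mathcal{L}_\gamma(w^k)$ is uniformly lower bounded, and it holds true that 

$$
\sum_{k=0}^{\infty} \left( \sum_{i=1}^{N-1} \|A_i x_i^k - A_i x_i^{k+1}\|^2 + \|x_N^k - x_N^{k+1}\|^2 + \|\lambda^k - \lambda^{k+1}\|^2 \right) \le \frac{2\gamma (1 + L^2)}{\gamma^2 - 2L^2} \left( \mathcal{L}_\gamma(w^0) - L^* \right), \tag{A.4}
$$

where $L^*$ is the uniform lower bound of $\mathcal{L}_\gamma(w^k)$, and hence 

$$
\lim_{k \to \infty} \left( \sum_{i=1}^{N-1} \|A_i x_i^k - A_i x_i^{k+1}\|^2 + \|x_N^k - x_N^{k+1}\|^2 + \|\lambda^k - \lambda^{k+1}\|^2 \right) = 0. \tag{A.5}
$$

Moreover, $\{(x_1^k, x_2^k, \ldots, x_N^k, \lambda^k) : k = 0, 1, \ldots\}$ is a bounded sequence. 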
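4. There exists an upper bound for a subgradient of the augmented Lagrangian in each iteration. Indeed, we define

\[
R_i^{k+1} := \gamma A_i^\top\Big(\sum_{j=1}^{N-1}A_jx_j^{k+1}+x_N^{k+1}-b\Big)-\gamma A_i^\top\Big(\sum_{j=i+1}^{N-1}A_j(x_j^k-x_j^{k+1})+(x_N^k-x_N^{k+1})\Big)
\]

and

\[
R_N^{k+1} := \gamma\Big(\sum_{i=1}^{N-1}A_ix_i^{k+1}+x_N^{k+1}-b\Big),\qquad
R_\lambda^{k+1} := b-\sum_{i=1}^{N-1}A_ix_i^{k+1}-x_N^{k+1},
\]

for each positive integer $k$ and $i = 1,2,\dots,N-1$. Then $(R_1^{k+1},\dots,R_N^{k+1},R_\lambda^{k+1}) \in \partial\mathcal{L}_\gamma(w^{k+1})$.

Moreover, it holds that

\[
\begin{aligned}
\big\|(R_1^{k+1},\dots,R_N^{k+1},R_\lambda^{k+1})\big\|
&\le \sum_{i=1}^{N}\|R_i^{k+1}\|+\|R_\lambda^{k+1}\|\\
&\le M\Big(\sum_{i=1}^{N-1}\|A_ix_i^k-A_ix_i^{k+1}\|+\|x_N^k-x_N^{k+1}\|+\|\lambda^k-\lambda^{k+1}\|\Big),\quad \forall k\ge 0,
\end{aligned}
\tag{A.6}
\]

where $M$ is a constant defined as

\[
M = \max\Big\{\gamma\sum_{i=1}^{N-1}\|A_i^\top\|,\ \frac{1}{\gamma}+1+\sum_{i=1}^{N-1}\|A_i^\top\|\Big\} > 0.
\tag{A.7}
\]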


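Proof of Lemma A.1

1. (A.1) follows from (4.10) directly. Then we consider the inequality (A.2). It follows from (A.1) and the fact that $\nabla f_N$ is Lipschitz continuous with constant $L$ that

\[
\|\lambda^{k+1}-\lambda^k\|^2 = \|\nabla f_N(x_N^{k+1})-\nabla f_N(x_N^k)\|^2 \le L^2\,\|x_N^{k+1}-x_N^k\|^2.
\]

2. Multiplying both sides of the optimality condition of the $x_1$-subproblem by $x_1^k-x_1^{k+1}$, and invoking the convexity of $f_1$, we have

\[
\begin{aligned}
0 &= (x_1^k-x_1^{k+1})^\top\Big[g_1(x_1^{k+1})-A_1^\top\lambda^k+\gamma A_1^\top\Big(A_1x_1^{k+1}+\sum_{j=2}^{N-1}A_jx_j^k+x_N^k-b\Big)\Big]\\
&\le f_1(x_1^k)-f_1(x_1^{k+1})-(\lambda^k)^\top(A_1x_1^k-A_1x_1^{k+1})\\
&\quad+\gamma\,(A_1x_1^k-A_1x_1^{k+1})^\top\Big(A_1x_1^{k+1}+\sum_{j=2}^{N-1}A_jx_j^k+x_N^k-b\Big)\\
&= f_1(x_1^k)-f_1(x_1^{k+1})-(\lambda^k)^\top(A_1x_1^k-A_1x_1^{k+1})
+\frac{\gamma}{2}\Big\|A_1x_1^k+\sum_{j=2}^{N-1}A_jx_j^k+x_N^k-b\Big\|^2\\
&\quad-\frac{\gamma}{2}\Big\|A_1x_1^{k+1}+\sum_{j=2}^{N-1}A_jx_j^k+x_N^k-b\Big\|^2-\frac{\gamma}{2}\big\|A_1x_1^k-A_1x_1^{k+1}\big\|^2\\
&= \mathcal{L}_\gamma(x_1^k,\dots,x_N^k,\lambda^k)-\mathcal{L}_\gamma(x_1^{k+1},x_2^k,\dots,x_N^k,\lambda^k)-\frac{\gamma}{2}\big\|A_1x_1^k-A_1x_1^{k+1}\big\|^2,
\end{aligned}
\tag{A.8}
\]

where the second equality holds due to (2.5).

For $i = 2,3,\dots,N$ (with the convention $A_N = I$), we can derive from (4.6) and (4.7) that

\[
\mathcal{L}_\gamma(x_1^{k+1},\dots,x_{i-1}^{k+1},x_i^k,\dots,x_N^k,\lambda^k)-\mathcal{L}_\gamma(x_1^{k+1},\dots,x_i^{k+1},x_{i+1}^k,\dots,x_N^k,\lambda^k)
\ge \frac{\gamma}{2}\big\|A_ix_i^k-A_ix_i^{k+1}\big\|^2.
\tag{A.9}
\]

Summing (A.8) and (A.9) over $i = 2,\dots,N$, we have

\[
\mathcal{L}_\gamma(x_1^k,\dots,x_N^k,\lambda^k)-\mathcal{L}_\gamma(x_1^{k+1},\dots,x_N^{k+1},\lambda^k)
\ge \frac{\gamma}{2}\sum_{i=1}^{N-1}\big\|A_ix_i^k-A_ix_i^{k+1}\big\|^2+\frac{\gamma}{2}\big\|x_N^k-x_N^{k+1}\big\|^2.
\tag{A.10}
\]

On the other hand, it follows from (A.1) that

\[
\mathcal{L}_\gamma(x_1^{k+1},\dots,x_N^{k+1},\lambda^k)-\mathcal{L}_\gamma(x_1^{k+1},\dots,x_N^{k+1},\lambda^{k+1})
= -\frac{1}{\gamma}\big\|\lambda^k-\lambda^{k+1}\big\|^2
\ge -\frac{L^2}{\gamma}\big\|x_N^k-x_N^{k+1}\big\|^2.
\tag{A.11}
\]

Combining (A.10) and (A.11) yields

\[
\begin{aligned}
&\mathcal{L}_\gamma(x_1^k,\dots,x_N^k,\lambda^k)-\mathcal{L}_\gamma(x_1^{k+1},\dots,x_N^{k+1},\lambda^{k+1})\\
&\quad\ge \frac{\gamma}{2}\sum_{i=1}^{N-1}\big\|A_ix_i^k-A_ix_i^{k+1}\big\|^2+\frac{\gamma^2-2L^2}{2\gamma}\big\|x_N^k-x_N^{k+1}\big\|^2\\
&\quad= \frac{\gamma}{2}\sum_{i=1}^{N-1}\big\|A_ix_i^k-A_ix_i^{k+1}\big\|^2+\frac{\gamma^2-2L^2}{2\gamma(1+L^2)}\big\|x_N^k-x_N^{k+1}\big\|^2+\frac{L^2(\gamma^2-2L^2)}{2\gamma(1+L^2)}\big\|x_N^k-x_N^{k+1}\big\|^2\\
&\quad\ge \frac{\gamma}{2}\sum_{i=1}^{N-1}\big\|A_ix_i^k-A_ix_i^{k+1}\big\|^2+\frac{\gamma^2-2L^2}{2\gamma(1+L^2)}\big\|x_N^k-x_N^{k+1}\big\|^2+\frac{\gamma^2-2L^2}{2\gamma(1+L^2)}\big\|\lambda^k-\lambda^{k+1}\big\|^2\\
&\quad\ge \frac{\gamma^2-2L^2}{2\gamma(1+L^2)}\Big(\sum_{i=1}^{N-1}\big\|A_ix_i^k-A_ix_i^{k+1}\big\|^2+\big\|x_N^k-x_N^{k+1}\big\|^2+\big\|\lambda^k-\lambda^{k+1}\big\|^2\Big),
\end{aligned}
\tag{A.12}
\]

where the last inequality holds due to the fact that

\[
\frac{\gamma}{2} \ge \frac{\gamma^2-2L^2}{2\gamma(1+L^2)}.
\]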
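The relations (A.1) and (A.3) can be checked numerically on a small instance. The sketch below is only an illustration under stated assumptions: it uses a hypothetical toy problem with $N = 3$ scalar blocks, $f_i(x_i) = \tfrac12 (x_i - c_i)^2$ for $i = 1, 2$, $f_3(x_3) = \tfrac{L}{2} x_3^2$, no set constraints, and made-up data `a1`, `a2`, `b`, `c1`, `c2`. Each block update is the closed-form minimizer of the corresponding quadratic subproblem, and the starting point satisfies $\lambda^0 = \nabla f_3(x_3^0)$.

```python
def admm_toy(iters=30):
    # Toy data (illustrative assumptions): N = 3 scalar blocks, A_3 = 1.
    a1, a2, b = 1.0, 2.0, 1.0
    c1, c2 = 1.0, -1.0
    L = 1.0        # f_3(x) = (L/2) x^2 has an L-Lipschitz gradient
    gamma = 2.0    # satisfies gamma > sqrt(2) * L
    coef = (gamma ** 2 - 2 * L ** 2) / (2 * gamma * (1 + L ** 2))

    def aug_lagrangian(x1, x2, x3, lam):
        r = a1 * x1 + a2 * x2 + x3 - b
        f = 0.5 * (x1 - c1) ** 2 + 0.5 * (x2 - c2) ** 2 + 0.5 * L * x3 ** 2
        return f - lam * r + 0.5 * gamma * r ** 2

    x1 = x2 = x3 = lam = 0.0   # lam^0 = grad f_3(x_3^0) holds at the start
    records = []
    for _ in range(iters):
        old = (x1, x2, x3, lam)
        # Gauss-Seidel sweep: closed-form minimizers of the block subproblems.
        x1 = (c1 + a1 * lam - gamma * a1 * (a2 * x2 + x3 - b)) / (1 + gamma * a1 ** 2)
        x2 = (c2 + a2 * lam - gamma * a2 * (a1 * x1 + x3 - b)) / (1 + gamma * a2 ** 2)
        x3 = (lam - gamma * (a1 * x1 + a2 * x2 - b)) / (L + gamma)
        # Dual update with step size gamma.
        lam = lam - gamma * (a1 * x1 + a2 * x2 + x3 - b)
        decrease = aug_lagrangian(*old) - aug_lagrangian(x1, x2, x3, lam)
        gap = ((a1 * (old[0] - x1)) ** 2 + (a2 * (old[1] - x2)) ** 2
               + (old[2] - x3) ** 2 + (old[3] - lam) ** 2)
        # (observed decrease, right-hand side of (A.3), residual of (A.1))
        records.append((decrease, coef * gap, abs(lam - L * x3)))
    return records
```

Each record pairs the observed decrease of the augmented Lagrangian with the lower bound from (A.3), so the first entry should never fall below the second beyond rounding error, and the third entry should be zero up to rounding.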
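3. Note that

\[
\mathcal{L}_\gamma(w^{k+1}) = \sum_{i=1}^{N-1}f_i(x_i^{k+1})+f_N(x_N^{k+1})-\Big\langle\lambda^{k+1},\sum_{i=1}^{N-1}A_ix_i^{k+1}+x_N^{k+1}-b\Big\rangle+\frac{\gamma}{2}\Big\|\sum_{i=1}^{N-1}A_ix_i^{k+1}+x_N^{k+1}-b\Big\|^2.
\]

It follows from (A.1) and the fact that $\nabla f_N$ is Lipschitz continuous with constant $L$ that

\[
\begin{aligned}
f_N\Big(b-\sum_{i=1}^{N-1}A_ix_i^{k+1}\Big)
&\le f_N(x_N^{k+1})+\Big\langle\nabla f_N(x_N^{k+1}),\,b-\sum_{i=1}^{N-1}A_ix_i^{k+1}-x_N^{k+1}\Big\rangle+\frac{L}{2}\Big\|b-\sum_{i=1}^{N-1}A_ix_i^{k+1}-x_N^{k+1}\Big\|^2\\
&= f_N(x_N^{k+1})-\Big\langle\nabla f_N(x_N^{k+1}),\,\sum_{i=1}^{N-1}A_ix_i^{k+1}+x_N^{k+1}-b\Big\rangle+\frac{L}{2}\Big\|\sum_{i=1}^{N-1}A_ix_i^{k+1}+x_N^{k+1}-b\Big\|^2\\
&= f_N(x_N^{k+1})-\Big\langle\lambda^{k+1},\,\sum_{i=1}^{N-1}A_ix_i^{k+1}+x_N^{k+1}-b\Big\rangle+\frac{L}{2}\Big\|\sum_{i=1}^{N-1}A_ix_i^{k+1}+x_N^{k+1}-b\Big\|^2.
\end{aligned}
\]

This implies that there exists $L^* > -\infty$ such that

\[
\mathcal{L}_\gamma(w^{k+1}) \ge \sum_{i=1}^{N-1}f_i(x_i^{k+1})+f_N\Big(b-\sum_{i=1}^{N-1}A_ix_i^{k+1}\Big)+\frac{\gamma-L}{2}\Big\|\sum_{i=1}^{N-1}A_ix_i^{k+1}+x_N^{k+1}-b\Big\|^2 \ge L^*,
\tag{A.13}
\]

where the last inequality holds since $\gamma > L$ and $\inf_{\mathcal{X}_i} f_i > f_i^*$ for $i = 1,2,\dots,N$.

Therefore, it directly follows from (A.3) and $\gamma > \sqrt{2}L$ that

\[
\frac{\gamma^2-2L^2}{2\gamma(1+L^2)}\sum_{k=0}^{K}\Big(\sum_{i=1}^{N-1}\|A_ix_i^k-A_ix_i^{k+1}\|^2+\|x_N^k-x_N^{k+1}\|^2+\|\lambda^{k+1}-\lambda^k\|^2\Big) \le \mathcal{L}_\gamma(w^0)-L^*.
\]

Letting $K \to \infty$, we have

\[
\frac{\gamma^2-2L^2}{2\gamma(1+L^2)}\sum_{k=0}^{\infty}\Big(\sum_{i=1}^{N-1}\|A_ix_i^k-A_ix_i^{k+1}\|^2+\|x_N^k-x_N^{k+1}\|^2+\|\lambda^{k+1}-\lambda^k\|^2\Big) \le \mathcal{L}_\gamma(w^0)-L^*,
\]

which implies (A.4) and (A.5).

It also follows from (A.13), (A.3) and $\gamma > \sqrt{2}L$ that

\[
\mathcal{L}_\gamma(w^0) \ge \mathcal{L}_\gamma(w^{k+1}) \ge \sum_{i=1}^{N-1}f_i(x_i^{k+1})+f_N\Big(b-\sum_{i=1}^{N-1}A_ix_i^{k+1}\Big)+\frac{\gamma-L}{2}\Big\|\sum_{i=1}^{N-1}A_ix_i^{k+1}+x_N^{k+1}-b\Big\|^2.
\]

This implies that $\{(x_1^k,x_2^k,\dots,x_{N-1}^k) : k = 0,1,\dots\}$ is a bounded sequence by using the coerciveness of $f_i + I_{\mathcal{X}_i}$, $i = 1,2,\dots,N-1$. The boundedness of $(x_N^k,\lambda^k)$ can be obtained by using (4.4), (A.2) and (A.5).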
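The quadratic upper bound used above for $f_N$ (often called the descent lemma) holds for any function whose gradient is $L$-Lipschitz. As a small sanity check, the sketch below evaluates the gap between the upper bound and the function value for the softplus function $\log(1+e^x)$, whose derivative is the sigmoid and is Lipschitz with constant $1/4$. The choice of softplus is an assumption made purely for illustration; it is not the $f_N$ of the paper.

```python
import math

def softplus(x):
    # Smooth convex function; its second derivative is at most 1/4.
    return math.log1p(math.exp(x))

def sigmoid(x):
    # Derivative of softplus.
    return 1.0 / (1.0 + math.exp(-x))

def descent_lemma_gap(x, y, L=0.25):
    """f(x) + f'(x)(y - x) + (L/2)(y - x)^2 - f(y) for f = softplus.

    The value is nonnegative whenever f' is L-Lipschitz.
    """
    return softplus(x) + sigmoid(x) * (y - x) + 0.5 * L * (y - x) ** 2 - softplus(y)
```

Evaluating `descent_lemma_gap` on a grid of pairs `(x, y)` should return nonnegative values up to rounding error.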
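4. From the definition of $\mathcal{L}_\gamma$, it is clear that for $i = 1,\dots,N-1$,

\[
g_i(x_i^{k+1})-A_i^\top\lambda^{k+1}+\gamma A_i^\top\Big(\sum_{j=1}^{N-1}A_jx_j^{k+1}+x_N^{k+1}-b\Big) \in \partial_{x_i}\mathcal{L}_\gamma(w^{k+1}),
\]

and

\[
\nabla f_N(x_N^{k+1})-\lambda^{k+1}+\gamma\Big(\sum_{j=1}^{N-1}A_jx_j^{k+1}+x_N^{k+1}-b\Big) = \nabla_{x_N}\mathcal{L}_\gamma(w^{k+1}),
\]

and

\[
b-\sum_{j=1}^{N-1}A_jx_j^{k+1}-x_N^{k+1} = \nabla_{\lambda}\mathcal{L}_\gamma(w^{k+1}),
\]

where $g_i \in \partial(f_i + I_{\mathcal{X}_i})$ for $i = 1,2,\dots,N-1$. Since (4.8), (4.9), and (4.10) imply that

\[
g_1(x_1^{k+1})-A_1^\top\lambda^{k+1} = -\gamma A_1^\top\Big(\sum_{j=2}^{N-1}A_j(x_j^k-x_j^{k+1})+(x_N^k-x_N^{k+1})\Big),
\tag{A.14}
\]

\[
g_i(x_i^{k+1})-A_i^\top\lambda^{k+1} = -\gamma A_i^\top\Big(\sum_{j=i+1}^{N-1}A_j(x_j^k-x_j^{k+1})+(x_N^k-x_N^{k+1})\Big),
\tag{A.15}
\]

\[
\nabla f_N(x_N^{k+1})-\lambda^{k+1} = 0,
\tag{A.16}
\]

we have

\[
R_i^{k+1} = \gamma A_i^\top\Big(\sum_{j=1}^{N-1}A_jx_j^{k+1}+x_N^{k+1}-b\Big)-\gamma A_i^\top\Big(\sum_{j=i+1}^{N-1}A_j(x_j^k-x_j^{k+1})+(x_N^k-x_N^{k+1})\Big) \in \partial_{x_i}\mathcal{L}_\gamma(w^{k+1}),
\]

\[
R_N^{k+1} = \gamma\Big(\sum_{i=1}^{N-1}A_ix_i^{k+1}+x_N^{k+1}-b\Big) = \nabla_{x_N}\mathcal{L}_\gamma(w^{k+1}),\qquad
R_\lambda^{k+1} = b-\sum_{i=1}^{N-1}A_ix_i^{k+1}-x_N^{k+1} = \nabla_{\lambda}\mathcal{L}_\gamma(w^{k+1}),
\]

for $i = 1,2,\dots,N-1$. This implies that $(R_1^{k+1},\dots,R_N^{k+1},R_\lambda^{k+1}) \in \partial\mathcal{L}_\gamma(w^{k+1})$.

We now need to estimate the norms of $R_i^{k+1}$, $1 \le i \le N-1$, and of $R_N^{k+1}$ and $R_\lambda^{k+1}$. Using $\gamma\big(\sum_{j=1}^{N-1}A_jx_j^{k+1}+x_N^{k+1}-b\big) = \lambda^k-\lambda^{k+1}$, it holds true that

\[
\begin{aligned}
\|R_i^{k+1}\| &\le \gamma\,\|A_i^\top\|\,\Big\|\sum_{j=i+1}^{N-1}(A_jx_j^k-A_jx_j^{k+1})+(x_N^k-x_N^{k+1})\Big\|+\|A_i^\top\|\,\|\lambda^k-\lambda^{k+1}\|\\
&\le \gamma\,\|A_i^\top\|\Big(\sum_{j=1}^{N-1}\|A_jx_j^k-A_jx_j^{k+1}\|+\|x_N^k-x_N^{k+1}\|\Big)+\|A_i^\top\|\,\|\lambda^k-\lambda^{k+1}\|,
\end{aligned}
\]

and

\[
\|R_N^{k+1}\| = \gamma\,\Big\|\sum_{i=1}^{N-1}A_ix_i^{k+1}+x_N^{k+1}-b\Big\| = \|\lambda^k-\lambda^{k+1}\|,
\]

and

\[
\|R_\lambda^{k+1}\| = \Big\|\sum_{i=1}^{N-1}A_ix_i^{k+1}+x_N^{k+1}-b\Big\| = \frac{1}{\gamma}\,\|\lambda^k-\lambda^{k+1}\|.
\]

Therefore, we arrive at (A.6), where $M$ is defined in (A.7).

□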


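Proof of Theorem 4.3

1. It has been proven in Lemma A.1 that $\{(x_1^k,x_2^k,\dots,x_N^k,\lambda^k) : k = 0,1,\dots\}$ is a bounded sequence. Therefore, we conclude that $\Omega(w^0)$ is non-empty by the Bolzano-Weierstrass Theorem. Let $w^* = (x_1^*,\dots,x_N^*,\lambda^*) \in \Omega(w^0)$ be a limit point of $\{w^k = (x_1^k,\dots,x_N^k,\lambda^k) : k = 0,1,\dots\}$. Then there exists a subsequence $\{w^{k_q} = (x_1^{k_q},\dots,x_N^{k_q},\lambda^{k_q}) : q = 0,1,\dots\}$ such that $w^{k_q} \to w^*$ as $q \to \infty$. Since $f_i$, $i = 1,\dots,N-1$, are lower semi-continuous, we obtain that

\[
\liminf_{q\to\infty} f_i(x_i^{k_q}) \ge f_i(x_i^*),\qquad i = 1,2,\dots,N.
\tag{A.17}
\]

From the iterative step (4.1)-(4.4), we have for any integer $k$ and any $i = 1,\dots,N-1$,

\[
x_i^{k+1} := \operatorname*{argmin}_{x_i\in\mathcal{X}_i}\ \mathcal{L}_\gamma(x_1^{k+1},\dots,x_{i-1}^{k+1},x_i,x_{i+1}^k,\dots,x_N^k;\lambda^k).
\]

Letting $x_i = x_i^*$ in the above, we get

\[
\mathcal{L}_\gamma(x_1^{k+1},\dots,x_{i-1}^{k+1},x_i^{k+1},x_{i+1}^k,\dots,x_N^k;\lambda^k) \le \mathcal{L}_\gamma(x_1^{k+1},\dots,x_{i-1}^{k+1},x_i^*,x_{i+1}^k,\dots,x_N^k;\lambda^k),
\]

i.e.,

\[
\begin{aligned}
&f_i(x_i^{k+1})-\langle\lambda^k,A_ix_i^{k+1}\rangle+\frac{\gamma}{2}\Big\|\sum_{l=1}^{i}A_lx_l^{k+1}+\sum_{l=i+1}^{N-1}A_lx_l^k+x_N^k-b\Big\|^2\\
&\quad\le f_i(x_i^*)-\langle\lambda^k,A_ix_i^*\rangle+\frac{\gamma}{2}\Big\|\sum_{l=1}^{i-1}A_lx_l^{k+1}+A_ix_i^*+\sum_{l=i+1}^{N-1}A_lx_l^k+x_N^k-b\Big\|^2.
\end{aligned}
\]

Choosing $k = k_q - 1$ in the above inequality and letting $q$ go to $+\infty$, we obtain

\[
\limsup_{q\to+\infty} f_i(x_i^{k_q}) \le \limsup_{q\to+\infty}\Big(\frac{\gamma}{2}\big\|A_ix_i^{k_q}-A_ix_i^*\big\|^2-\big\langle\lambda^{k_q-1},A_ix_i^*-A_ix_i^{k_q}\big\rangle\Big)+f_i(x_i^*),
\tag{A.18}
\]

for $i = 1,2,\dots,N-1$. Here we have used the facts that the sequence $\{w^k : k = 0,1,\dots\}$ is bounded, that $\gamma$ is finite, that the distance between two successive iterates tends to zero (A.5), and that

\[
\sum_{l=1}^{i}A_lx_l^{k+1}+\sum_{l=i+1}^{N-1}A_lx_l^k+x_N^k-b = \sum_{l=i+1}^{N-1}\big(A_lx_l^k-A_lx_l^{k+1}\big)+\big(x_N^k-x_N^{k+1}\big)+\frac{1}{\gamma}\big(\lambda^k-\lambda^{k+1}\big).
\]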
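From (A.5) we also have $x_i^{k_q} \to x_i^*$ as $q \to \infty$, hence (A.18) reduces to

\[
\limsup_{q\to\infty} f_i(x_i^{k_q}) \le f_i(x_i^*).
\]

Therefore, combining with (A.17), $f_i(x_i^{k_q})$ tends to $f_i(x_i^*)$ as $q \to \infty$. We can thus conclude that

\[
\begin{aligned}
\lim_{q\to\infty}\mathcal{L}_\gamma(w^{k_q})
&= \lim_{q\to\infty}\Big(\sum_{i=1}^{N}f_i(x_i^{k_q})-\Big\langle\lambda^{k_q},\sum_{i=1}^{N-1}A_ix_i^{k_q}+x_N^{k_q}-b\Big\rangle+\frac{\gamma}{2}\Big\|\sum_{i=1}^{N-1}A_ix_i^{k_q}+x_N^{k_q}-b\Big\|^2\Big)\\
&= \sum_{i=1}^{N}f_i(x_i^*)-\Big\langle\lambda^*,\sum_{i=1}^{N-1}A_ix_i^*+x_N^*-b\Big\rangle+\frac{\gamma}{2}\Big\|\sum_{i=1}^{N-1}A_ix_i^*+x_N^*-b\Big\|^2\\
&= \mathcal{L}_\gamma(w^*).
\end{aligned}
\tag{A.19}
\]

On the other hand, it follows from (A.5) and (A.6) that

\[
(R_1^{k+1},\dots,R_N^{k+1},R_\lambda^{k+1}) \to (0,\dots,0),\qquad k\to\infty.
\tag{A.20}
\]

It implies that $(0,\dots,0) \in \partial\mathcal{L}_\gamma(x_1^*,\dots,x_N^*,\lambda^*)$ due to the closedness of $\partial\mathcal{L}_\gamma$. Therefore, $w^* = (x_1^*,\dots,x_N^*,\lambda^*)$ is a critical point of $\mathcal{L}_\gamma(x_1,\dots,x_N,\lambda)$.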

2. The proof for this assertion directly follows from Lemma 5 and Remark 5 of [4]. We omit the proof here for succinctness.

3. We define that $L^*$ is the finite limit of $\mathcal{L}_\gamma(x_1^k,\dots,x_N^k,\lambda^k)$ as $k$ goes to infinity, i.e.,

\[
L^* = \lim_{k\to\infty}\mathcal{L}_\gamma(x_1^k,\dots,x_N^k,\lambda^k).
\]

Take $w^* \in \Omega(w^0)$. There exists a subsequence $\{w^{k_q}\}$ converging to $w^*$ as $q$ goes to infinity. Since we have proven that

\[
\lim_{q\to\infty}\mathcal{L}_\gamma(w^{k_q}) = \mathcal{L}_\gamma(w^*),
\]

and $\mathcal{L}_\gamma(w^k)$ is a non-increasing sequence, we conclude that $\mathcal{L}_\gamma(w^*) = L^*$, hence the restriction of $\mathcal{L}_\gamma(x_1,\dots,x_N,\lambda)$ to $\Omega(w^0)$ equals $L^*$.

